Protein arrays and methods of using and making the same

ABSTRACT

Methods and devices are provided for preparing a protein array having a plurality of proteins. In one embodiment, the method includes providing a plurality of nucleic acids each having a predefined sequence and expressing in vitro a plurality of proteins from the plurality of nucleic acids. In another embodiment, protein arrays having a solid surface and a microvolume are also provided. The solid surface can have a plurality of anchor oligonucleotides capable of hybridizing with a plurality of nucleic acids. The microvolume can cover each of the plurality of anchor oligonucleotides and can be configured to produce a polypeptide from each of the plurality of nucleic acids.

RELATED APPLICATIONS

This application is a divisional application of U.S. application Ser. No. 16/578,488, filed Sep. 23, 2019, which is a divisional application of U.S. application Ser. No. 13/884,446, filed Jul. 29, 2013, now issued U.S. Pat. No. 10,457,935, which is a national phase application of International Application No. PCT/US2011/060217, filed Nov. 10, 2011, which claims priority to and the benefit of U.S. provisional application No. 61/412,941, filed Nov. 12, 2010, the entire contents of each of which are incorporated by reference herein.

FIELD OF THE INVENTION

Methods and devices provided herein generally relate to the preparation of high content gene libraries and libraries of polypeptides expressed therefrom. More particularly, the methods and devices involve microvolume reactions, gene assembly, on-surface expression, high throughput analysis, and/or pathway development on a solid support.

BACKGROUND

In vitro manipulation of nucleic acids and proteins is an important aspect of modern molecular biology and functional genomics. Recent advances in DNA synthesis have started to make the production of large scale DNA libraries an economical reality. However, the next step, the use of those DNA libraries to produce large ordered protein libraries, has not scaled to the same extent. Synthetic DNA constructs encoding genes are routinely cloned into plasmids that are then introduced into bacteria, which are then in turn used as hosts for protein production. While these techniques themselves are routine, they are time consuming and relatively limited in the potential scale of the library to be generated. As for commercially available libraries, each component of the library must be grown up in the host bacteria separately. This limits the scalability of in vivo protein library expression, as each sample must be grown up in a minimal volume in order to produce enough protein to work with in downstream applications. In addition, growth of the host bacteria can be costly in time, taking a day or more for incubation. Together, these issues limit the number of samples that can be effectively grown in

Accordingly, there is a need for high-throughput techniques for the rapid synthesis and selection of genes and proteins of interest. Such techniques would permit the discovery and development of proteins with improved properties that can be used for analytical, research, diagnostic, and therapeutic purposes.

SUMMARY

Methods and devices of the present invention relate to microvolume reactions, gene assembly, on-surface expression, high throughput analysis, and/or pathway development on a solid support.

In one aspect, the present invention features a method for preparing a protein array having a plurality of proteins. The method includes providing a plurality of nucleic acids each having a predefined sequence. The method further includes expressing in vitro a plurality of proteins from the plurality of nucleic acids.

In certain embodiments, the method further includes measuring an activity of each of the plurality of proteins. In some embodiments, the plurality of nucleic acids are produced on a solid surface. The plurality of nucleic acids can each comprise a regulatory genetic sequence. In certain embodiments, expressing in vitro can be performed in a micro-well plate or at discrete features on a support or a solid surface.

The present invention, in a second aspect, features a method for preparing a protein array having a plurality of proteins. The method includes providing a microvolume comprising a population of nucleic acids. In some embodiments, the population of nucleic acids can be immobilized at discrete features of the support by hybridizing the population of nucleic acids onto support-bound anchor oligonucleotides. The population of nucleic acids can have a plurality of distinct, predefined sequences. The method further includes expressing in vitro in said microvolume a plurality of proteins from the population of nucleic acids.

In one aspect, the present invention features a method for producing at least one protein, the method comprising (a) providing a support having a plurality of distinct features, each feature comprising a plurality of immobilized anchor oligonucleotides; (b) generating at least one plurality of nucleic acids having a predefined sequence onto the plurality of anchor oligonucleotides; (c) providing a microvolume onto at least one feature of the support; and (d) expressing in vitro in the microvolume the at least one protein from the at least one nucleic acid.

The microvolume can comprise reagents appropriate for expressing in vitro the at least one protein from the at least one nucleic acid. In some embodiments, each feature of the support comprises a distinct plurality of support-bound anchor oligonucleotides wherein the 5′ end of each of the plurality of anchor oligonucleotides is complementary to the 5′ end of a distinct nucleic acid having a predefined sequence. In some embodiments, the plurality of nucleic acids are generated by assembling a plurality of construction oligonucleotides comprising partially overlapping sequences that define the sequence of the at least one nucleic acid. In some embodiments, the at least one nucleic acid is generated under (i) ligation conditions, (ii) chain extension conditions, or (iii) chain extension and ligation conditions. In some embodiments, the method further comprises verifying the at least one nucleic acid sequence prior to the step of expressing the protein(s). In some embodiments, the method further comprises synthesizing a plurality of partially overlapping construction oligonucleotides, wherein each construction oligonucleotide is synthesized at a distinct feature of the support comprising immobilized complementary construction oligonucleotides, releasing the construction oligonucleotides in at least one microvolume, and transferring the at least one microvolume to a feature comprising a plurality of anchor oligonucleotides.

In some aspects of the invention, the methods and devices may be used to produce at least 100, 1,000, 10,000 or more proteins. In some embodiments, the proteins are protein variants. In some embodiments, the method further comprises screening the at least one protein to identify proteins having a desired characteristic.

In another aspect, protein arrays having a solid surface and a microvolume are also provided. The solid surface can have a plurality of anchor oligonucleotides capable of hybridizing with a plurality of nucleic acids. The microvolume can cover each of the plurality of anchor oligonucleotides and can be configured to produce a polypeptide from each of the plurality of nucleic acids.

In another aspect, protein arrays comprise (a) a first plurality of features on a support, each of the first plurality of features comprising a plurality of immobilized single stranded oligonucleotides, wherein the plurality of single stranded oligonucleotides comprises partially overlapping sequences that define the sequence of each of a plurality of nucleic acid molecules encoding a plurality of proteins; and (b) a second plurality of features, the second plurality of features comprising a plurality of anchor oligonucleotides having a sequence complementary to a terminus sequence of each of the plurality of nucleic acids.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1D illustrate an embodiment of the method and device for gene synthesis. FIG. 1A illustrates a microarray (10) having groups (20) of single-stranded DNA (40). FIG. 1A illustrates introduction of universal primer (30) and polymerase (50) to create a complementary second strand of DNA (60), and of enzyme (70) to fragment primer. FIG. 1B illustrates release of a plurality of populations of oligonucleotides (65) in solution. FIG. 1C illustrates the transfer of eluted oligonucleotides (65) to an area having anchor oligonucleotides A (90) immobilized on its surface. FIG. 1D illustrates anchor oligonucleotides A (90) acting as points of nucleation for the construction of the DNA product (100) on the surface of the microarray (10).

FIGS. 2A-2C illustrate an embodiment of the method and device for on-surface expression of proteins. FIG. 2A illustrates a nucleic acid having a T7 promoter (100), an E. coli ribosome binding site (110) upstream of a gene of interest (120), and a transcriptional terminator (130) downstream of the gene of interest (120). FIG. 2B illustrates a library of synthetic nucleic acid constructs integrated into a plasmid (140) or in linear form (150) being transferred to individual wells of a micro-well plate containing the appropriate transcription/translation reaction reagents (160), and expression of the proteins (170) encoded by the gene library. FIG. 2C illustrates a droplet based dispensing apparatus (200) to dispense droplets (210) of transcription/translation reagents directly onto specific locations of the solid surface (180) and covering the deposited nucleic acid constructs (190) forming a self-contained reaction volume (220). FIG. 2C illustrates the solid surface with the reaction mixtures inserted into a humidity controlled chamber (230) and expression of the proteins (170) from the nucleic acid library.

FIGS. 3A-3D illustrate an embodiment of the method and device for high throughput analysis of on-surface expressed proteins. FIG. 3A illustrates incubation of solid support (310) having synthetic nucleic acid constructs (300) covered with a droplet of transcription/translation reagent mix (320) in a humidity controlled chamber (330). FIG. 3B illustrates protein expression phase using a droplet based dispensing apparatus (340) to dispense reagents in dropwise manner at specific locations (or features) on the support or solid surface (310). FIG. 3C illustrates conversion of substrate (360) to product (370) in presence of enzyme (355). FIG. 3D illustrates sampling and assaying the presence of product (370).

FIGS. 4A-4D illustrate an embodiment of the method and device for multi-protein pathway development. FIG. 4A illustrates a microarray (400) containing the oligonucleotides (410) to build a set of genes (420). FIG. 4B illustrates mixing linear nucleic acid constructs (440) to give all the potential combinations (460). FIG. 4C illustrates production of protein combinations (475). FIG. 4D illustrates conversion of substrates (492) to product (495).

DETAILED DESCRIPTION OF THE INVENTION

As used herein, the term “gene” refers to a nucleic acid that contains information necessary for expression of a polypeptide, protein, or untranslated RNA (e.g., rRNA, tRNA, anti-sense RNA). When the gene encodes a protein, it includes the promoter and the structural gene open reading frame sequence (ORF), as well as other sequences involved in the expression of the protein. When the gene encodes an untranslated RNA, it includes the promoter and the nucleic acid that encodes the untranslated RNA.

The term “gene of interest” (GOI) refers to any nucleotide sequence (e.g., RNA or DNA), the manipulation of which may be deemed desirable for any reason (e.g., treat disease, confer improved qualities, expression of a protein of interest in a host cell, expression of a ribozyme, etc.), by one of ordinary skill in the art. Such nucleotide sequences include, but are not limited to, coding sequences of structural genes (e.g., reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (e.g., promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.).

As used herein, the phrase “nucleic acids” or “nucleic acid molecule” refers to a sequence of contiguous nucleotides (riboNTPs, dNTPs, ddNTPs or combinations thereof) of any length. A nucleic acid molecule may encode a full-length polypeptide or a fragment of any length thereof, or may be non-coding. As used herein, the terms “nucleic acids,” “nucleic acid molecule”, “polynucleotide” and “oligonucleotide” may be used interchangeably and include both single-stranded (ss) and double-stranded (ds) RNA, DNA and RNA:DNA hybrids.

Nucleic acid sequences that are “complementary” are those that are capable of base-pairing according to the standard Watson-Crick complementarity rules. As used herein, the term “complementary sequences” means nucleic acid sequences that are substantially complementary, as may be assessed by the same nucleotide comparison set forth above, or as defined as being capable of hybridizing to the polynucleotides that encode the protein sequences under stringent conditions, such as those described herein.
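The Watson-Crick rules referred to above can be illustrated with a short sketch. It assumes DNA written 5′ to 3′ with only the letters A, C, G and T, and it tests exact complementarity only; the "substantially complementary" and stringent-hybridization criteria in the definition are not modeled. The function names are hypothetical and chosen for illustration.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(_PAIRS[base] for base in reversed(seq.upper()))


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b is the exact reverse complement of strand_a.

    Assumption: exact matching only; partial or "substantial"
    complementarity is outside the scope of this sketch.
    """
    return reverse_complement(strand_a) == strand_b.upper()
```

For example, `are_complementary("TATAAT", "ATTATA")` holds because the two strands pair base for base when aligned antiparallel.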

As used herein, a “polymerase” is an enzyme that catalyzes synthesis of nucleic acids using a preexisting nucleic acid template. Examples include DNA polymerase (which catalyzes DNA→DNA reactions), RNA polymerase (DNA→RNA) and reverse transcriptase (RNA→DNA).

As used herein, the term “polypeptide” refers to a sequence of contiguous amino acids of any length. The terms “peptide,” “oligopeptide,” or “protein” may be used interchangeably herein with the term “polypeptide.”

As used herein, the terms “promoter,” “promoter element,” or “promoter sequence” refer to a DNA sequence which when ligated to a nucleotide sequence of interest is capable of controlling the transcription of the nucleotide sequence of interest into mRNA. A promoter is typically, though not necessarily, located 5′ (i.e., upstream) of a nucleotide sequence of interest whose transcription into mRNA it controls, and provides a site for specific binding by RNA polymerase and other transcription factors for initiation of transcription. Promoters may be constitutive or regulatable. The term “constitutive” when made in reference to a promoter means that the promoter is capable of directing transcription of an operably linked nucleic acid sequence in the absence of a stimulus (e.g., heat shock, chemicals, etc.). In contrast, a “regulatable” promoter is one that is capable of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a stimulus (e.g., heat shock, chemicals, etc.), which is different from the level of transcription of the operably linked nucleic acid sequence in the absence of the stimulus.

One should appreciate that promoters have modular architecture and that the modular architecture may be altered. Bacterial promoters typically include a core promoter element and additional promoter elements. The core promoter refers to the minimal portion of the promoter required to initiate transcription. A core promoter includes a Transcription Start Site, a binding site for RNA polymerases and general transcription factor binding sites. The “transcription start site” refers to the first nucleotide to be transcribed and is designated +1. Nucleotides downstream of the start site are numbered +1, +2, etc., and nucleotides upstream of the start site are numbered −1, −2, etc. Additional promoter elements are located 5′ (i.e. typically 30-250 bp upstream of the start site) of the core promoter and regulate the frequency of the transcription. The proximal promoter elements and the distal promoter elements constitute specific transcription factor sites. In prokaryotes, a core promoter usually includes two consensus sequences, a −10 sequence and a −35 sequence, which are recognized by sigma factors (see, for example, Hawley, D. K. et al. (1983) Nucl. Acids Res. 11, 2237-2255). The −10 sequence (10 bp upstream from the first transcribed nucleotide) is typically about 6 nucleotides in length and is typically made up of the nucleotides adenosine and thymidine (also known as the Pribnow box). In some embodiments, the nucleotide sequence of the −10 sequence is 5′-TATAAT or may comprise 3 to 6 base pairs of the consensus sequence. The presence of this box is essential to the start of the transcription. The −35 sequence of a core promoter is typically about 6 nucleotides in length. The nucleotide sequence of the −35 sequence is typically made up of each of the four nucleosides. The presence of this sequence allows a very high transcription rate. In some embodiments, the nucleotide sequence of the −35 sequence is 5′-TTGACA or may comprise 3 to 6 base pairs of the consensus sequence. In some embodiments, the −10 and the −35 sequences are spaced by about 17 nucleotides. Eukaryotic promoters are more diverse than prokaryotic promoters and may be located several kilobases upstream of the transcription starting site. Some eukaryotic promoters contain a TATA box (e.g. containing the consensus sequence TATAAA or part thereof), which is located typically within 40 to 120 bases of the transcriptional start site. One or more upstream activation sequences (UAS), which are recognized by specific binding proteins, can act as activators of the transcription. These UAS sequences are typically found upstream of the transcription initiation site. The distance between the UAS sequences and the TATA box is highly variable and may be up to 1 kb.
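The prokaryotic core-promoter layout described above (a −35 hexamer, a spacer of about 17 nucleotides, then a −10 hexamer) can be illustrated with a simple scan. This is a minimal sketch, not a validated promoter predictor: the consensus strings come from the text, while the function names, the default spacer window of 16 to 18 nucleotides and the `min_matches` threshold are assumptions made for illustration.

```python
MINUS_35 = "TTGACA"  # -35 consensus given in the text
MINUS_10 = "TATAAT"  # -10 consensus (Pribnow box) given in the text


def count_matches(window: str, consensus: str) -> int:
    """Count positions at which a window agrees with the consensus."""
    return sum(1 for a, b in zip(window, consensus) if a == b)


def find_core_promoters(seq, spacer_range=(16, 18), min_matches=5):
    """Return (pos_35, pos_10, spacer) for each candidate core promoter.

    Positions are 0-based indices into seq. spacer_range and min_matches
    are illustrative assumptions; the text says only that the spacer is
    "about 17" nucleotides and that 3 to 6 consensus bases may be present.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(MINUS_35) + 1):
        if count_matches(seq[i:i + 6], MINUS_35) < min_matches:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break
            if count_matches(seq[j:j + 6], MINUS_10) >= min_matches:
                hits.append((i, j, spacer))
    return hits
```

Lowering `min_matches` toward 3 makes the scan more permissive, in line with the "3 to 6 base pairs of the consensus" language, at the cost of many more spurious hits.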

As used herein, the terms “protein of interest” (POI) and “desired protein” refer to a polypeptide under study, or whose expression is desired by one practicing the methods disclosed herein. A protein of interest is encoded by its cognate gene of interest (GOI). The identity of a POI can be known or not known. A POI can be a polypeptide encoded by an open reading frame.

As used herein, unless otherwise stated, the term “transcription” refers to the synthesis of RNA from a DNA template; the term “translation” refers to the synthesis of a polypeptide from an mRNA template. Translation in general is regulated by the sequence and structure of the 5′ untranslated region (UTR) of the mRNA transcript. One regulatory sequence is the ribosome binding site (RBS), which promotes efficient and accurate translation of mRNA. The prokaryotic RBS is the Shine-Dalgarno sequence, a purine-rich sequence of the 5′ UTR that is complementary to the UCCU core sequence of the 3′-end of 16S rRNA (located within the 30S small ribosomal subunit). Various Shine-Dalgarno sequences have been found in prokaryotic mRNAs and generally lie about 10 nucleotides upstream from the AUG start codon. Activity of a RBS can be influenced by the length and nucleotide composition of the spacer separating the RBS and the initiator AUG. In eukaryotes, the Kozak sequence A/GCCACCAUGG (SEQ ID NO. 1), which lies within a short 5′ untranslated region, directs translation of mRNA. An mRNA lacking the Kozak consensus sequence may also be translated efficiently in an in vitro system if it possesses a moderately long 5′ UTR that lacks stable secondary structure. While E. coli ribosomes preferentially recognize the Shine-Dalgarno sequence, eukaryotic ribosomes (such as those found in retic lysate) can efficiently use either the Shine-Dalgarno or the Kozak ribosomal binding sites.
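As a rough illustration of the spacing relationship between a Shine-Dalgarno-like motif and the start codon, the sketch below searches a DNA-sense sequence for ATG codons preceded by a motif within a configurable gap. The default motif `AGGA` is simply the reverse complement of the UCCU core mentioned above; the gap window of 4 to 12 nucleotides and the function name are assumptions for illustration, since the text states only that the motif generally lies about 10 nucleotides upstream of the AUG.

```python
def find_rbs_start_pairs(dna, motif="AGGA", min_gap=4, max_gap=12):
    """Return (motif_pos, start_pos, gap) for each motif/ATG pairing.

    dna is the coding-strand DNA (T instead of U). gap is the number of
    nucleotides between the end of the motif and the A of ATG. The motif
    and the gap window are illustrative assumptions, not values taken
    from the source text.
    """
    dna = dna.upper()
    pairs = []
    start = dna.find("ATG")
    while start != -1:
        for gap in range(min_gap, max_gap + 1):
            motif_pos = start - gap - len(motif)
            if motif_pos >= 0 and dna[motif_pos:motif_pos + len(motif)] == motif:
                pairs.append((motif_pos, start, gap))
        start = dna.find("ATG", start + 1)
    return pairs
```

A sketch like this only reports candidate pairings; as the paragraph notes, actual RBS activity also depends on the length and composition of the spacer.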

As used herein, the term “vector” refers to any genetic element, such as a plasmid, phage, transposon, cosmid, chromosome, virus, virion, etc., which is capable of replication when associated with the proper control elements and which can transfer gene sequences between cells. The vector may contain a marker suitable for use in the identification of transformed cells. For example, markers may provide tetracycline resistance or ampicillin resistance. Types of vectors include cloning and expression vectors. As used herein, the term “cloning vector” refers to a plasmid or phage DNA or other DNA sequence which is able to replicate autonomously in a host cell and which is characterized by one or a small number of restriction endonuclease recognition sites and/or sites for site-specific recombination. A foreign DNA fragment may be spliced into the vector at these sites in order to bring about the replication and cloning of the fragment. The term “expression vector” refers to a vector which is capable of expressing a gene that has been cloned into it. Such expression can occur after transformation into a host cell, or in IVPS systems. The cloned DNA is usually operably linked to one or more regulatory sequences, such as promoters, repressor binding sites, terminators, enhancers and the like. The promoter sequences can be constitutive, inducible and/or repressible.

As used herein, the term “host” refers to any prokaryotic or eukaryotic (e.g., mammalian, insect, yeast, plant, avian, animal, etc.) organism that is a recipient of a replicable expression vector, cloning vector or any nucleic acid molecule. The nucleic acid molecule may contain, but is not limited to, a sequence of interest, a transcriptional regulatory sequence (such as a promoter, enhancer, repressor, and the like) and/or an origin of replication. As used herein, the terms “host,” “host cell,” “recombinant host” and “recombinant host cell” may be used interchangeably. For examples of such hosts, see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

As used herein, “in vitro” refers to systems outside a cell or organism and may sometimes be referred to as a cell-free system. In vivo systems relate to essentially intact cells whether in suspension or attached to or in contact with other cells or a solid. In vitro systems have an advantage of being more readily manipulated. For example, delivering components to a cell interior is not a concern; manipulations incompatible with continued cell function are also possible. However, in vitro systems involve disrupted cells or the use of various components to provide the desired function and thus spatial relationships of the cell are lost. When an in vitro system is prepared, components possibly critical to the desired activity can be lost with discarded cell debris. Thus in vitro systems are more manipulatable and can function differently from in vivo systems. In some embodiments, hybrid in vitro/in vivo systems can also be used.

The terms “in vitro transcription” (IVT) and “cell-free transcription” are used interchangeably herein and are intended to refer to any method for cell-free synthesis of RNA from DNA without synthesis of protein from the RNA. A preferred RNA is messenger RNA (mRNA), which encodes proteins. The terms “in vitro transcription-translation” (IVTT), “cell-free transcription-translation”, “DNA template-driven in vitro protein synthesis” and “DNA template-driven cell-free protein synthesis” are used interchangeably herein and are intended to refer to any method for cell-free synthesis of mRNA from DNA (transcription) and of protein from mRNA (translation). The terms “in vitro protein synthesis” (IVPS), “in vitro translation”, “cell-free translation”, “RNA template-driven in vitro protein synthesis”, “RNA template-driven cell-free protein synthesis” and “cell-free protein synthesis” are used interchangeably herein and are intended to refer to any method for cell-free synthesis of a protein. IVTT, including coupled transcription and translation, is one non-limiting example of IVPS.

As used herein the terms “nucleic acid”, “polynucleotide”, “oligonucleotide” are used interchangeably and refer to naturally-occurring or synthetic polymeric forms of nucleotides. The oligonucleotides and nucleic acid molecules of the present invention may be formed from naturally occurring nucleotides, for example forming deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) molecules. Alternatively, the naturally occurring oligonucleotides may include structural modifications to alter their properties, such as in peptide nucleic acids (PNA) or in locked nucleic acids (LNA). The solid phase synthesis of oligonucleotides and nucleic acid molecules with naturally occurring or artificial bases is well known in the art. The terms should be understood to include equivalents, analogs of either RNA or DNA made from nucleotide analogs and as applicable to the embodiment being described, single-stranded or double-stranded polynucleotides. Nucleotides useful in the invention include, for example, naturally-occurring nucleotides (for example, ribonucleotides or deoxyribonucleotides), or natural or synthetic modifications of nucleotides, or artificial bases.

As used herein, the term “monomer” refers to a member of a set of small molecules which are or can be joined together to form an oligomer, a polymer or a compound composed of two or more members. The particular ordering of monomers within a polymer is referred to herein as the “sequence” of the polymer. The set of monomers includes but is not limited to, for example, the set of common L-amino acids, the set of D-amino acids, the set of synthetic and/or natural amino acids, the set of nucleotides and the set of pentoses and hexoses. Aspects of the invention are described herein primarily with regard to the preparation of oligonucleotides, but could readily be applied in the preparation of other polymers such as peptides or polypeptides, polysaccharides, phospholipids, heteropolymers, polyesters, polycarbonates, polyureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, polyacetates, or any other polymers.

As used herein, the term “predefined sequence” means that the sequence of the polymer is designed or known and chosen before synthesis or assembly of the polymer. In particular, aspects of the invention are described herein primarily with regard to the preparation of nucleic acid molecules, the sequence of the oligonucleotides or polynucleotides being known and chosen before the synthesis or assembly of the nucleic acid molecules. In some embodiments of the technology provided herein, immobilized oligonucleotides or polynucleotides are used as a source of material. In various embodiments, the methods described herein use oligonucleotides, their sequence being determined based on the sequence of the final polynucleotide constructs to be synthesized. In one embodiment, oligonucleotides are short nucleic acid molecules. For example, oligonucleotides may be from 10 to about 300 nucleotides, from 20 to about 400 nucleotides, from 30 to about 500 nucleotides, from 40 to about 600 nucleotides, or more than about 600 nucleotides long. However, shorter or longer oligonucleotides may be used. Oligonucleotides may be designed to have different lengths. In some embodiments, the sequence of the polynucleotide construct may be divided up into a plurality of shorter sequences that can be synthesized in parallel and assembled into a single or a plurality of desired polynucleotide constructs using the methods described herein.
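The idea of dividing a predefined construct into shorter, partially overlapping sequences can be sketched as a simple tiling. This is an illustration under stated assumptions, not the design procedure of the invention: the function name, the default oligonucleotide length of 60 and overlap of 20 are hypothetical, and all oligonucleotides are taken from one strand, whereas practical assembly designs often alternate strands and balance melting temperatures of the overlaps.

```python
def split_into_oligos(construct: str, oligo_len: int = 60, overlap: int = 20):
    """Tile a construct into oligonucleotides sharing `overlap` nucleotides.

    Each oligo begins (oligo_len - overlap) nucleotides after the previous
    one, so consecutive oligos share exactly `overlap` nucleotides. The
    final oligo may be shorter than oligo_len. Default values are
    illustrative assumptions only.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(construct[start:start + oligo_len])
        if start + oligo_len >= len(construct):
            break  # this oligo reaches the end of the construct
        start += step
    return oligos
```

Because the sequence is predefined, every oligonucleotide and every overlap is known before synthesis, and the construct can be recovered by joining each oligo after trimming its leading overlap.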

As used herein, the term “genome” refers to the whole hereditary information of an organism that is encoded in the DNA (or RNA for certain viral species) including both coding and non-coding sequences. In various embodiments, the term may include the chromosomal DNA of an organism and/or DNA that is contained in an organelle such as, for example, the mitochondria or chloroplasts and/or extrachromosomal plasmid and/or artificial chromosome. A “native gene” refers to a gene that is native to the host cell with its own regulatory sequences whereas an “exogenous gene” or “heterologous gene” refers to any gene that is not a native gene, comprising regulatory and/or coding sequences that are not native to the host cell. In some embodiments, a heterologous gene may comprise mutated sequences or part of regulatory and/or coding sequences. In some embodiments, the regulatory sequences may be heterologous or homologous to a gene of interest. A heterologous regulatory sequence does not function in nature to regulate the same gene(s) it is regulating in the transformed host cell. “Coding sequence” refers to a DNA sequence coding for a specific amino acid sequence. As used herein, “regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, RNA processing sites, effector binding sites and stem-loop structures.

As described herein, a genetic element may be any coding or non-coding nucleic acid sequence. In some embodiments, a genetic element is a nucleic acid that codes for an amino acid, a peptide or a protein. Genetic elements may be operons, genes, gene fragments, promoters, exons, introns, etc. or any combination thereof. Genetic elements can be as short as one or a few codons or may be longer including functional components (e.g. encoding proteins) and/or regulatory components. In some embodiments, a genetic element consists of an entire open reading frame of a protein, or consists of the entire open reading frame and one or more (or all) regulatory sequences associated with that open reading frame. One skilled in the art will appreciate that the genetic elements can be viewed as modular genetic elements or genetic modules. For example, a genetic module can comprise a regulator sequence or a promoter or a coding sequence or any combination thereof. In some embodiments, the genetic element comprises at least two different genetic modules and at least two recombination sites. In eukaryotes, the genetic element can comprise at least three modules. For example, a genetic module can be a regulator sequence or a promoter, a coding sequence, and a polyadenylation tail or any combination thereof. In addition to the promoter and the coding sequences, the nucleic acid sequence may comprise control modules including, but not limited to, a leader sequence, a signal sequence and a transcription terminator sequence. The leader sequence is a non-translated region operably linked to the 5′ terminus of the coding nucleic acid sequence. The signal peptide sequence codes for an amino acid sequence linked to the amino terminus of the polypeptide which directs the polypeptide into the cell's secretion pathway.

Genetic elements or genetic modules may derive from the genome ofnatural organisms or from synthetic polynucleotides or from acombination thereof. In some embodiments, the genetic elements ormodules derive from different organisms. Genetic elements or modulesuseful for the methods described herein may be obtained from a varietyof sources such as, for example, DNA libraries, BAC libraries, de novochemical synthesis, or excision and modification of a genomic segment.The sequences obtained from such sources may then be modified usingstandard molecular biology and/or recombinant DNA technology to producepotynucleotide constructs having desired modifications forreintroduction into, or construction of, a large product nucleic acid,including a modified, partially synthetic or fully synthetic genome.Exemplary methods for modification of polynucleotide sequences obtainedfrom a genome or library include, for example, site directedmutagenesis; PCR mutagenesis; inserting, deleting or swapping portionsof a sequence using restriction enzymes optionally in combination withligation; in vitro or in vivo homologous recombination; andsite-specific recombination; or various combinations thereof. In otherembodiments, the genetic sequences useful in accordance with the methodsdescribed herein may be synthetic polynucleotides. Syntheticpolynucleotides may be produced using a variety of methods such as highthroughput oligonucleotide assembly techniques known in the art. Forexample, oligonucleotides having complementary, overlapping sequencesmay be synthesized on an array and then eluted or released from thearray. The oligonucleotides can then be induced to self assemble basedon hybridization of the complementary regions. In some embodiments, themethods involve one or more nucleic assembly reactions in order tosynthesize the genetic elements of interest. The method may use in vitroand/or in vivo nucleic acid assembly procedures. 
Non-limiting examples of nucleic acid assembly procedures and of nucleic acid library assembly procedures are known in the art and can be found in, for example, U.S. patent applications 20060194214, 20070231805, 20070122817, 20070269870, 20080064610, 20080287320, the disclosures of which are incorporated by reference.

In some embodiments, genetic element sequences share less than 99%, less than 95%, less than 90%, less than 80%, or less than 70% identity with a native or natural nucleic acid sequence. Identity can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When an equivalent position in the compared sequences is occupied by the same base or amino acid, then the molecules are identical at that position; when the equivalent site is occupied by the same or a similar amino acid residue (e.g., similar in steric and/or electronic nature), then the molecules can be referred to as homologous (similar) at that position. Expression as a percentage of homology, similarity, or identity refers to a function of the number of identical or similar amino acids at positions shared by the compared sequences. Various alignment algorithms and/or programs may be used, including FASTA, BLAST, or ENTREZ. FASTA and BLAST are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with, e.g., default settings. ENTREZ is available through the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Md. In one embodiment, the percent identity of two sequences can be determined by the GCG program with a gap weight of 1, e.g., each amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch between the two sequences. Other techniques for alignment are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, Calif., USA. Preferably, an alignment program that permits gaps in the sequence is utilized to align the sequences.
The Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70: 173-187 (1997). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. An alternative search strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel computer.
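The percent identity calculation described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external program and that gaps are written as "-"; the function name `percent_identity` is hypothetical, and the sketch is not the GCG, FASTA or BLAST implementation. Each gap position is counted as a single mismatch, in the spirit of the gap weight of 1 mentioned above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned (equal length, gaps as `gap`).
    Each gap column counts as one mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences carry a gap; they hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A column matches only when both residues are identical and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # One gap in six aligned columns: five of six positions are identical.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```

A sequence pair could then be compared against the thresholds listed above (for example, less than 90% identity) by checking the returned value.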

It should be appreciated that the nucleic acid sequence of interest or the gene of interest may be derived from the genome of natural organisms. In some embodiments, genes of interest may be excised from the genome of a natural organism or from the host genome, for example E. coli. It has been shown that it is possible to excise large genomic fragments by in vitro enzymatic excision and in vivo excision and amplification. For example, the FLP/FRT site specific recombination system and the Cre/loxP site specific recombination systems have been efficiently used for excision of large genomic fragments for the purpose of sequencing (see, Yoon et al., Genetic Analysis: Biomolecular Engineering, 1998, 14: 89-95). In some embodiments, excision and amplification techniques can be used to facilitate artificial genome or chromosome assembly. Genomic fragments may be excised from the E. coli chromosome and altered before being inserted into the host cell artificial genome or chromosome. In some embodiments, the excised genomic fragments can be assembled with engineered promoters and inserted into the genome of the host cell.

Other terms used in the fields of recombinant nucleic acid technology and molecular and cell biology as used herein will be generally understood by one of ordinary skill in the applicable arts.

Solid Supports

Some embodiments of the devices and methods provided herein use oligonucleotides that are immobilized on a support or substrate. As used herein the terms “support” and “substrate” are used interchangeably and refer to a porous or non-porous solvent insoluble material on which polymers such as nucleic acids are synthesized or immobilized. As used herein “porous” means that the material contains pores having substantially uniform diameters (for example in the nm range). Porous materials include paper, synthetic filters and the like. In such porous materials, the reaction may take place within the pores. The support can have any one of a number of shapes, such as pin, strip, plate, disk, rod, bends, cylindrical structure, particle, including bead, nanoparticle and the like. The support can have variable widths.

The support can be hydrophilic or capable of being rendered hydrophilic and includes inorganic powders such as silica, magnesium sulfate, and alumina; natural polymeric materials, particularly cellulosic materials and materials derived from cellulose, such as fiber containing papers, e.g., filter paper, chromatographic paper, etc.; synthetic or modified naturally occurring polymers, such as nitrocellulose, cellulose acetate, poly(vinyl chloride), polyacrylamide, cross linked dextran, agarose, polyacrylate, polyethylene, polypropylene, poly(4-methylbutene), polystyrene, polymethacrylate, poly(ethylene terephthalate), nylon, poly(vinyl butyrate), polyvinylidene difluoride (PVDF) membrane, glass, controlled pore glass, magnetic controlled pore glass, ceramics, metals, and the like; either used by themselves or in conjunction with other materials.

In some embodiments, oligonucleotides are synthesized on an array format. For example, single-stranded oligonucleotides are synthesized in situ on a common support wherein each oligonucleotide is synthesized on a separate or discrete feature (or spot) on the substrate. In preferred embodiments, single stranded oligonucleotides are bound to the surface of the support or feature. As used herein the term “array” refers to an arrangement of discrete features for storing, routing, amplifying and releasing oligonucleotides or complementary oligonucleotides for further reactions. In a preferred embodiment, the support or array is addressable: the support includes two or more discrete addressable features at a particular predetermined location (i.e., an “address”) on the support. Therefore, each oligonucleotide molecule of the array is localized to a known and defined location on the support. The sequence of each oligonucleotide can be determined from its position on the support. Moreover, addressable supports or arrays enable the direct control of individual isolated volumes such as droplets. The size of the defined feature can be chosen to allow formation of a microvolume droplet on the feature, each droplet being kept separate from the others. As described herein, features are typically, but need not be, separated by interfeature spaces to ensure that droplets on two adjacent features do not merge. Interfeatures will typically not carry any oligonucleotide on their surface and will correspond to inert space. In some embodiments, features and interfeatures may differ in their hydrophilicity or hydrophobicity properties. In some embodiments, features and interfeatures may comprise a modifier as described herein.

Arrays may be constructed, custom ordered or purchased from a commercial vendor (e.g., Agilent, Affymetrix, Nimblegen). Oligonucleotides are attached, spotted, immobilized, surface-bound, supported or synthesized on the discrete features of the surface or array. Oligonucleotides may be covalently attached to the surface or deposited on the surface. Various methods of construction are well known in the art, e.g. maskless array synthesizers, light directed methods utilizing masks, flow channel methods, spotting methods, etc.

In some embodiments, construction and/or selection oligonucleotides may be synthesized on a solid support using a maskless array synthesizer (MAS). Maskless array synthesizers are described, for example, in PCT application No. WO 99/42813 and in corresponding U.S. Pat. No. 6,375,903. Other examples are known of maskless instruments which can fabricate a custom DNA microarray in which each of the features in the array has a single-stranded DNA molecule of desired sequence.

Other methods for synthesizing construction and/or selection oligonucleotides include, for example, light-directed methods utilizing masks, flow channel methods, spotting methods, pin-based methods, and methods utilizing multiple supports.

Light directed methods utilizing masks (e.g., VLSIPS™ methods) for the synthesis of oligonucleotides are described, for example, in U.S. Pat. Nos. 5,143,854; 5,510,270 and 5,527,681. These methods involve activating predefined regions of a solid support and then contacting the support with a preselected monomer solution. Selected regions can be activated by irradiation with a light source through a mask, much in the manner of photolithography techniques used in integrated circuit fabrication. Other regions of the support remain inactive because illumination is blocked by the mask and they remain chemically protected. Thus, a light pattern defines which regions of the support react with a given monomer. By repeatedly activating different sets of predefined regions and contacting different monomer solutions with the support, a diverse array of polymers is produced on the support. Other steps, such as washing unreacted monomer solution from the support, can be optionally used. Other applicable methods include mechanical techniques such as those described in U.S. Pat. No. 5,384,261.
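The mask-and-monomer cycle described above can be modeled, purely for illustration, as a loop in which each cycle pairs a mask (the set of illuminated feature indices) with one monomer. The following Python sketch is a toy model of the bookkeeping only, not of the photochemistry; the function names `masks_for_targets` and `light_directed_synthesis` and the mask scheduling scheme are assumptions made for this example and are not taken from the cited patents.

```python
def masks_for_targets(targets, alphabet="ACGT"):
    """Derive one (mask, monomer) cycle per position and monomer.

    Hypothetical scheduling scheme: at each chain position, features that
    need the same monomer next are illuminated together.
    """
    cycles = []
    longest = max((len(t) for t in targets), default=0)
    for position in range(longest):
        for monomer in alphabet:
            mask = {i for i, t in enumerate(targets)
                    if len(t) > position and t[position] == monomer}
            if mask:  # skip cycles in which no feature would be activated
                cycles.append((mask, monomer))
    return cycles


def light_directed_synthesis(n_features, cycles):
    """Extend only the features selected by each mask with that cycle's monomer."""
    features = [""] * n_features
    for mask, monomer in cycles:
        for index in mask:
            if not 0 <= index < n_features:
                raise IndexError(f"feature {index} is not on the support")
            # Unmasked (illuminated) feature: couple the monomer.
            features[index] += monomer
    return features


if __name__ == "__main__":
    wanted = ["AG", "AT", "CG"]
    print(light_directed_synthesis(len(wanted), masks_for_targets(wanted)))
```

In this model, features left out of a mask correspond to regions that stay chemically protected during that cycle.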

Additional methods applicable to synthesis of construction and/or selection oligonucleotides on a single support are described, for example, in U.S. Pat. No. 5,384,261. For example, reagents may be delivered to the support by either (1) flowing within a channel defined on predefined regions or (2) “spotting” on predefined regions. Other approaches, as well as combinations of spotting and flowing, may be employed as well. In each instance, certain activated regions of the support are mechanically separated from other regions when the monomer solutions are delivered to the various reaction sites. Flow channel methods involve, for example, microfluidic systems to control synthesis of oligonucleotides on a solid support. For example, diverse polymer sequences may be synthesized at selected regions of a solid support by forming flow channels on a surface of the support through which appropriate reagents flow or in which appropriate reagents are placed. Spotting methods for preparation of oligonucleotides on a solid support involve delivering reactants in relatively small quantities by directly depositing them in selected regions. In some steps, the entire support surface can be sprayed or otherwise coated with a solution, if it is more efficient to do so. Precisely measured aliquots of monomer solutions may be deposited dropwise by a dispenser that moves from region to region.

Pin-based methods for synthesis of oligonucleotides on a solid support are described, for example, in U.S. Pat. No. 5,288,514. Pin-based methods utilize a support having a plurality of pins or other extensions. The pins are each inserted simultaneously into individual reagent containers in a tray. An array of 96 pins is commonly utilized with a 96-container tray, such as a 96-well microtiter dish. Each tray is filled with a particular reagent for coupling in a particular chemical reaction on an individual pin. Accordingly, the trays will often contain different reagents. Since the chemical reactions have been optimized such that each of the reactions can be performed under a relatively similar set of reaction conditions, it becomes possible to conduct multiple chemical coupling steps simultaneously.

In another embodiment, a plurality of oligonucleotides may be synthesized on multiple supports. One example is a bead based synthesis method which is described, for example, in U.S. Pat. Nos. 5,770,358; 5,639,603; and 5,541,061. For the synthesis of molecules such as oligonucleotides on beads, a large plurality of beads is suspended in a suitable carrier (such as water) in a container. The beads are provided with optional spacer molecules having an active site to which is complexed, optionally, a protecting group. At each step of the synthesis, the beads are divided for coupling into a plurality of containers. After the nascent oligonucleotide chains are deprotected, a different monomer solution is added to each container, so that on all beads in a given container, the same nucleotide addition reaction occurs. The beads are then washed of excess reagents, pooled in a single container, mixed and re-distributed into another plurality of containers in preparation for the next round of synthesis. It should be noted that by virtue of the large number of beads utilized at the outset, there will similarly be a large number of beads randomly dispersed in the container, each having a unique oligonucleotide sequence synthesized on a surface thereof after numerous rounds of randomized addition of bases. An individual bead may be tagged with a sequence which is unique to the double-stranded oligonucleotide thereon, to allow for identification during use.
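The divide, couple, pool and mix rounds described above can be sketched as a small simulation. This Python example is an illustrative model only; the function name `split_and_pool`, the equal-sized split, and the use of one container per monomer are assumptions made for the sketch rather than details from the cited patents.

```python
import random


def split_and_pool(n_beads, rounds, monomers="ACGT", seed=None):
    """Simulate split-and-pool synthesis; returns one sequence per bead.

    Assumptions: one container per monomer, beads divided as evenly as
    possible, and every coupling succeeds.
    """
    rng = random.Random(seed)
    beads = [""] * n_beads
    for _ in range(rounds):
        # Mixing the pooled beads randomizes which container each one enters.
        rng.shuffle(beads)
        # Divide the pool into one container per monomer.
        containers = [beads[i::len(monomers)] for i in range(len(monomers))]
        # Within a container, every bead receives the same monomer; then pool.
        beads = [sequence + monomer
                 for monomer, container in zip(monomers, containers)
                 for sequence in container]
    return beads


if __name__ == "__main__":
    pool = split_and_pool(n_beads=1000, rounds=3, seed=1)
    print(len(pool), "beads carrying", len(set(pool)), "distinct sequences")
```

Each bead carries a single sequence because it visits exactly one container per round, while the pool as a whole samples the combinatorial sequence space.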

In yet another embodiment, a plurality of oligonucleotides may be attached or synthesized on nanoparticles. Nanoparticles include but are not limited to metal (e.g., gold, silver, copper and platinum), semiconductor (e.g., CdSe, CdS, and CdS coated with ZnS) and magnetic (e.g., ferromagnetite) materials. Methods to attach oligonucleotides to the nanoparticles are known in the art. In another embodiment, nanoparticles are attached to the substrate. Nanoparticles with or without immobilized oligonucleotides can be attached to substrates as described in, e.g., Grabar et al., Analyt. Chem., 67, 735-743 (1995); Bethell et al., J. Electroanal. Chem., 409, 137 (1996); Bar et al., Langmuir, 12, 1172 (1996); Colvin et al., J. Am. Chem. Soc., 114, 5221 (1992). Naked nanoparticles may be first attached to the substrate and oligonucleotides can be attached to the immobilized nanoparticles.

Pre-synthesized oligonucleotide and/or polynucleotide sequences may be attached to a support or synthesized in situ using light-directed methods, flow channel and spotting methods, inkjet methods, pin-based methods and bead-based methods set forth in the following references: McGall et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:13555; Synthetic DNA Arrays In Genetic Engineering, Vol. 20:111, Plenum Press (1998); Duggan et al. (1999) Nat. Genet. S21:10; Microarrays: Making Them and Using Them In Microarray Bioinformatics, Cambridge University Press, 2003; U.S. Patent Application Publication Nos. 2003/0068633 and 2002/0081582; U.S. Pat. Nos. 6,833,450, 6,830,890, 6,824,866, 6,800,439, 6,375,903 and 5,700,637; and PCT Publication Nos. WO 04/031399, WO 04/031351, WO 04/029586, WO 03/100012, WO 03/066212, WO 03/065038, WO 03/064699, WO 03/064027, WO 03/064026, WO 03/046223, WO 03/040410 and WO 02/24597; the disclosures of which are incorporated herein by reference in their entirety for all purposes. In some embodiments, pre-synthesized oligonucleotides are attached to a support or are synthesized using a spotting methodology wherein monomer solutions are deposited dropwise by a dispenser that moves from region to region (e.g. ink jet). In some embodiments, oligonucleotides are spotted on a support using, for example, a mechanical wave actuated dispenser.

Microfluidic Devices and Microvolume Reactions

Provided herein are microfluidic devices for the manipulation of droplets on a substrate (e.g. solid support). Methods and devices for synthesizing or amplifying oligonucleotides as well as for preparing or assembling long polynucleotides having a predefined sequence are provided herein. Aspects of the technology provided herein are useful for increasing the accuracy, yield, throughput, and/or cost efficiency of nucleic acid synthesis and assembly reactions.

The manipulation of fluids to form fluid streams of desired configuration, such as discontinuous fluid streams, particles, dispersions, etc., for purposes of fluid delivery, product manufacture, analysis, and the like, is a relatively well-studied art. See for example, WO/2004/002627 which is incorporated herein in its entirety. In some aspects of the invention, microfluidic devices are used to form and manipulate droplets in a co-planar fashion to allow oligonucleotide synthesis. For example, oligonucleotides may be synthesized using a phosphoramidite method. The phosphoramidite method, employing nucleotides modified with various protecting groups, is one of the most commonly used methods for the de novo synthesis of oligonucleotides. Detailed procedures for the phosphoramidite and hydrogen phosphonate methods of oligonucleotide synthesis are described in the following references that are incorporated by reference: U.S. Pat. Nos. 4,500,707; 4,725,677; and 5,047,524. See also for example, methods outlined in Oligonucleotide and Analogs: A practical approach, F. Eckstein, Ed., Oxford University Press, and Oligonucleotide synthesis: A practical approach, Gait, Ed., IRL Press, Oxford. Synthesis can be performed either through the coupling of the 5′ position of the first monomer to the 3′ position of the second monomer (3′-5′ synthesis) or vice versa (5′-3′ synthesis). Briefly, synthesis of oligonucleotides requires the specific formation of a 3′-5′ or 5′-3′ phosphodiester linkage. In order to form these specific linkages, the nucleophilic centers not involved in the linkage must be chemically protected through the use of a protecting group. By “protecting group” as used herein is meant a species which prevents a segment of a molecule (e.g. nucleotide) from undergoing a specific chemical reaction, but which is removable from the molecule following completion of that reaction. For example, the 5′ hydroxyl group may be protected by dimethoxytrityl (DMT).
During the deblocking reaction, the DMT is removed with an acid, such as trichloroacetic acid (TCA) or dichloroacetic acid, resulting in a free hydroxyl group. After washing, a phosphoramidite nucleotide is activated by tetrazole, ethylthiotetrazole, dicyanoimidazole, or benzimidazolium triflate, for example, which removes the iPr2N group on the phosphate group. The deprotected 5′ hydroxyl of the first base reacts with the phosphate of the second base and a 5′-3′ linkage is formed (coupling step). Unbound bases are washed out and 5′ hydroxyl groups that did not react during the coupling reaction are blocked by adding a capping group, which permanently binds to the free 5′ hydroxyl groups to prevent any further chemical transformation of that group (capping step). The oxidation step may be performed before or after the capping step. During oxidation, the phosphite linkage is stabilized to form a much more stable phosphate linkage. The deblocking/coupling/capping/oxidation cycle may be repeated the requisite number of times to achieve the desired length polynucleotide. In some embodiments, coupling can be synchronized on the array or solid support.

In some embodiments, the oligonucleotide synthesis is performed using a device that generates emulsion droplets comprising aqueous droplets within immiscible oil. The droplets may comprise an aqueous phase, an immiscible oil phase, and a surfactant and/or other stabilizing molecules to maintain the integrity of the droplet. In some embodiments, mechanical energy is applied, allowing dispersion of a compound into an oil phase to form droplets, each of which contains a single sort of compound. Preferably, the compound is a nucleotide monomer (i.e. A, T or U, G, C). The compounds can be deposited into the oil phase in the form of droplets generated using inkjet printing technology or piezoelectric drop-on-demand (DOD) inkjet printing technology. Each droplet may comprise a different nucleotide monomer (i.e. A, T or U, G, C) in the same aqueous solution. In preferred embodiments, the droplets are uniform in size and contain one nucleotide at a fixed concentration. The droplets range in size from 0.5 microns to 500 microns in diameter, which correspond to a volume of about 1 picoliter to about 1 nanoliter. Yet in other embodiments, the droplet may comprise a 2-mer, a 3-mer, a 4-mer, a 6-mer or a 7-mer oligonucleotide. In some embodiments, the droplets are deposited onto a substrate such as a microsubstrate, a microarray or a microchip. The terms microsubstrate, microarray and microchip are used interchangeably herein. The droplets may be deposited using a microfluidic nozzle. In some embodiments, the substrate may be subjected to wash, deblocking solution, coupling, capping and oxidation reactions to elongate the oligonucleotide.
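For orientation, droplet diameter and volume can be related by a short Python sketch. It assumes an ideal spherical droplet, which is a simplification introduced here and not a statement from the specification; the function names are hypothetical. The conversion uses the identity 1 pL = 1000 cubic micrometers.

```python
import math

UM3_PER_PL = 1000.0  # 1 picoliter equals 1000 cubic micrometers


def sphere_volume_pl(diameter_um: float) -> float:
    """Volume in picoliters of an ideal sphere of the given diameter in micrometers."""
    return math.pi / 6.0 * diameter_um ** 3 / UM3_PER_PL


def sphere_diameter_um(volume_pl: float) -> float:
    """Diameter in micrometers of an ideal sphere holding the given volume in picoliters."""
    return (6.0 * volume_pl * UM3_PER_PL / math.pi) ** (1.0 / 3.0)


if __name__ == "__main__":
    for volume_pl in (1.0, 1000.0):  # 1 pL and 1 nL
        print(f"{volume_pl:g} pL -> {sphere_diameter_um(volume_pl):.1f} um")
```

Under this spherical assumption, 1 picoliter corresponds to a diameter of roughly 12 micrometers and 1 nanoliter to roughly 124 micrometers. Droplets in an emulsion or on a surface may deviate from a sphere, so these figures are rough guides only.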

In some embodiments, the droplets carrying the nucleotides can be moved using electrowetting technologies. Electrowetting involves modifying the surface tension of liquids on a solid surface using a voltage. Application of an electric field (e.g. alternating or direct) can modify the contact angle between the fluid and surfaces. For example, by applying a voltage, the wetting properties of a hydrophobic surface can become increasingly hydrophilic and therefore wettable. The electrowetting principle is based on manipulating droplets on a surface comprising an array of electrodes and using voltage to change the interfacial tension. In some embodiments, the array of electrodes is not in direct contact with the fluid. In some embodiments, the array of electrodes is configured such that the support has a hydrophilic side and a hydrophobic side. The droplets subjected to the voltage will move towards the hydrophilic side. In some embodiments, the array or pattern of electrodes is a high density pattern. One should appreciate that to be used in conjunction with the phosphoramidite chemistry, the array of electrodes should be able to move droplet volumes ranging from 1 pL (and less) to 10 pL. Accordingly, aspects of the invention relate to a high voltage complementary semi-conductor microfluidic controller. In some embodiments, the high voltage complementary semi-conductor device (HV-CMOS) has an integrated circuit with a high density electrode pattern and high voltage electronics. In some embodiments, the voltage applied is between 15V and 30V.

Methods and devices provided herein involve amplification and/or small assembly reaction volumes such as microvolumes, nanovolumes, picovolumes or sub-picovolumes. Accordingly, aspects of the invention relate to methods and devices for amplification and/or assembly of polynucleotide sequences in small volume droplets on separate and addressable features of a support. For example, a plurality of oligonucleotides complementary to surface-bound single stranded oligonucleotides is synthesized in a predefined reaction microvolume of solution by template-dependent synthesis. In some embodiments, predefined reaction microvolumes of between about 0.5 pL and about 100 nL may be used. However, smaller or larger volumes may be used. In some embodiments, a mechanical wave actuated dispenser may be used for transferring volumes of less than 100 nL, less than 10 nL, less than 5 nL, less than 100 pL, less than 10 pL, or about 0.5 pL or less. In some embodiments, the mechanical wave actuated dispenser can be a piezoelectric inkjet device or an acoustic liquid handler. In a preferred embodiment, a piezoelectric inkjet device is used and can deliver picoliter solutions in a very precise manner on a support.

In various embodiments, methods and devices are provided for processing independently one or more plurality of oligonucleotides and/or polypeptides in a temperature dependent manner at addressable features in isolated liquid volumes. In some embodiments, the method is conducted in a manner to provide a set of predefined single-stranded oligonucleotide sequences or complementary oligonucleotide sequences for further specified reactions or processing. One should appreciate that, each feature being independently addressable, each reaction can be processed independently within a predefined isolated liquid volume or droplet on a discrete feature (e.g. virtual chamber). In some embodiments, the arrays are stored dry for subsequent reactions. In a preferred embodiment, support immobilized oligonucleotides can be hydrated independently with an aqueous solution. Aqueous solution includes, but is not limited to, water, buffer, primers, master mix, release chemicals, enzymes, or any combination thereof. Aqueous solution can be spotted or jetted onto specific surface location(s) corresponding to the discrete feature(s). Subsequently, miscible as well as non-miscible solution or aqueous gel can be deposited in the same fashion. Alternatively, a mechanical wave actuated dispenser can be used for transferring small volumes of fluids (e.g., picoliter or sub-picoliter). A mechanical wave actuated dispenser can be a piezoelectric inkjet device or an acoustic liquid handler. A piezoelectric inkjet device can eject fluids by actuating a piezoelectric actuation mechanism, which forces fluid droplets to be ejected. Piezoelectrics in general have good operating bandwidth and can generate large forces in a compact size. Some of the commercially available piezoelectric inkjet microarraying instruments include those from Perkin Elmer (Wellesley, Mass.), GeSim (Germany) and MicroFab (Plano, Tex.).
Typical piezoelectric dispensers can create droplets in the picoliter range and with coefficients of variation of 3-7%. Inkjetting technologies and devices for ejecting a plurality of fluid droplets toward discrete features on a substrate surface for deposition thereon have been described in a number of patents such as U.S. Pat. Nos. 6,511,849; 6,514,704; 6,042,211; 5,658,802, the disclosure of each of which is incorporated herein by reference.

In one embodiment, the fluid or solution deposition is performed using an acoustic liquid handler or ejector. Acoustic devices are non-contact dispensing devices able to dispense small volumes of fluid (e.g. picoliter to microliter); see for example Echo 550 from Labcyte (CA), HTS-01 from EDC Biosystems. Acoustic technologies and devices for acoustically ejecting a plurality of fluid droplets toward discrete sites on a substrate surface for deposition thereon have been described in a number of patents such as U.S. Pat. Nos. 6,416,164; 6,596,239; 6,802,593; 6,932,097; 7,090,333 and US Patent Application 2002-0037579, the disclosure of each of which is incorporated herein by reference. The acoustic device includes an acoustic radiation generator or transducer that may be used to eject fluid droplets from a reservoir (e.g. microplate wells) through a coupling medium. The pressure of the focused acoustic waves at the fluid surface creates an upwelling, thereby causing the liquid to urge upwards so as to eject a droplet, for example from a well of a source plate, to a receiving plate positioned above the fluid reservoir. The volume of the droplet ejected can be determined by selecting the appropriate sound wave frequency.

In some embodiments, the source plate comprising water, buffer, primers, master mix, release chemicals, enzymes, or any combination thereof and the destination plates comprising the oligonucleotides or polynucleotides are matched up to allow proper delivery or spotting of the reagent to the proper site. The mechanical wave actuated dispenser may be coupled with a microscope and/or a camera to provide positional selection of deposited spots. A camera may be placed on both sides of the destination plate or substrate. A camera may be used to register the positioning on the array, especially if the DNA is coupled with a fluorescent label.

One should appreciate that when manipulating small liquid volumes such as picoliters and nanoliters, the smaller the droplet, the faster it will evaporate. Therefore, aspects of the invention relate to methods and devices to limit, retard or prevent water evaporation. In some embodiments, the discrete features or a subset of discrete features can be coated with a substance capable of trapping or capturing water. In other embodiments, the water-trapping material can be spin-coated onto the support. Different materials or substances can be used to trap water at specific locations. For example, the water trapping substance may be an aqueous matrix, a gel, a colloid or any suitable polymer. In some embodiments, the material is chosen to have a melting point that allows the material to remain solid or semi-solid (e.g. gel) at the reaction temperatures such as denaturing temperatures or thermocycling temperatures. Water trapping materials include but are not limited to colloidal silica, peptide gel, agarose, solgel and polydimethylsiloxane. In an exemplary embodiment, Snowtex® colloidal silica (Nissan Chemical) may be used. Snowtex® colloidal silica is composed of mono-dispersed, negatively charged, amorphous silica particles in water. Snowtex® colloidal silica can be applied as a dry gel or as a hydrated gel onto the surface. In a preferred embodiment, the water trapping substance is spotted at discrete features comprising surface-bound oligonucleotides. Alternatively, oligonucleotides can be synthesized on the particles or nanoparticles (e.g. colloidal particles, Snowtex® colloidal silica) and the particles or nanoparticles can be dispensed to the discrete features of the surface. In some embodiments, the water trapping substance is spotted on the discrete features of the support using a mechanical device, an inkjet device or an acoustic liquid handler.

One should appreciate that evaporation can also be limited by forming a physical barrier between the surface of the droplet and the atmosphere. For example, a non-miscible solution can be overlaid to protect the droplet from evaporation. In some embodiments, a small volume of the non-miscible solution is dispensed directly and selectively at discrete locations of the substrate such as features comprising a droplet. In some other embodiments, the non-miscible solution is dispensed onto a subset of features comprising a droplet. In other embodiments, the non-miscible solution is applied uniformly over the surface of the array, forming a non-miscible bilayer in which the droplets are trapped. The non-miscible bilayer can then be evaporated to form a thin film over the surface or over a substantial part of the surface of the droplet. The non-miscible solution includes, but is not limited to, mineral oil, vegetable oil, silicone oil, paraffin oil, natural or synthetic wax, organic solvent that is immiscible in water or any combination thereof. One skilled in the art will appreciate that depending on the composition of the oils, some oils may partially or totally solidify at or below room temperature. In some embodiments, the non-miscible solution may be a natural or synthetic wax such as paraffin hydrocarbon. Paraffin is an alkane hydrocarbon with the general formula C_(n)H_(2n+2). Depending on the length of the molecule, paraffin may appear as a gas, a liquid or a solid at room temperature. Paraffin wax refers to the solids with 20≤n≤40 and has a typical melting point between about 47° C. and 64° C. Accordingly, in some embodiments, the support may be stored capped with a wax. Prior to use, heat may be applied to the support to allow the wax to turn into a liquid wax phase coating the support.
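The general formula C_(n)H_(2n+2) and the 20≤n≤40 range for paraffin wax given above can be expressed in a few lines of Python. This is a minimal illustrative sketch; the function names are hypothetical, and the standard atomic masses used for the molar mass estimate (12.011 for carbon, 1.008 for hydrogen) are supplied here rather than taken from the specification.

```python
CARBON_MASS = 12.011   # g/mol, standard atomic mass (assumption of this sketch)
HYDROGEN_MASS = 1.008  # g/mol, standard atomic mass (assumption of this sketch)


def paraffin_formula(n: int) -> str:
    """Molecular formula of the alkane with n carbon atoms."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return f"C{n}H{2 * n + 2}"


def paraffin_molar_mass(n: int) -> float:
    """Approximate molar mass in g/mol of the alkane with n carbon atoms."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * CARBON_MASS + (2 * n + 2) * HYDROGEN_MASS


def is_paraffin_wax(n: int) -> bool:
    """True when n falls in the 20..40 range given for paraffin wax."""
    return 20 <= n <= 40


if __name__ == "__main__":
    for n in (20, 40):
        print(paraffin_formula(n), round(paraffin_molar_mass(n), 1))
```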

In some aspects of the invention, in subsequent steps, a solvent or an aqueous solution may be added to the droplet having a non-miscible solution at its surface. Aqueous solution may be added, for example, to initiate a reaction, to adjust a volume, to adjust a pH, to increase or decrease a solute concentration, etc. One would appreciate that the aqueous solution can penetrate the non-miscible layer using different mechanisms. For example, if using an inkjet head device, the aqueous solution is ejected and the physical momentum of the ejected droplet will enable the aqueous solution to cross the non-miscible layer. Other mechanisms may employ additional forces, such as for example magnetic and/or electrostatic forces and/or optical forces. The optical and magnetic forces can be created simultaneously or independently of one another. Furthermore, the mechanism can utilize coupled magneto-optical tweezers. In some embodiments, the aqueous solution to be dispensed contains magnetic nanoparticles and a magnetic force can be used to help penetration of the non-miscible layer. Alternatively, the aqueous solution carries an electrostatic charge and an externally applied electric field can be used to achieve penetration of the non-miscible layer.

Yet, in another aspect of the invention, the size of the droplet is continuously or frequently monitored. One should appreciate that the size of the droplet is determined by the volume and by the surface tension of the solution. Accordingly, loss of volume can be detected by a decrease of the droplet footprint or of the radius of the droplet footprint. For example, using an optical monitoring system, through a microscope lens and camera system, the size or footprint of the droplet can be determined and the volume of the droplet can be calculated. In some embodiments, the volume of the droplet or the radius of the droplet is monitored every second or every millisecond. One would appreciate that the magnitude of the evaporation rate of the solvent (e.g. water) from the droplet of interest depends in part on the temperature and thus increases with increasing temperatures. For example, during amplification by thermocycling or during denaturation of the double-stranded complexes, an increase of temperature will result in the rapid evaporation of the droplet. Therefore, the volume of the droplet can be monitored more frequently and the droplet volume can be adjusted by re-hydration more frequently. In the event of volume fluctuation such as loss of volume, sub-pico to nano volumes of solvent (e.g. water) can be dispensed onto the droplet or onto the discrete feature comprising the droplet. Solvent or water volumes of about 0.5 pL, of about 1 pL, of about 10 pL, of about 100 pL, of about 1 nL, of about 10 nL, of about 100 nL can be dispensed this way. Solvent or water volumes may be delivered by any conventional delivery means as long as the volumes are controlled and accurate. In a preferred embodiment, water is dispensed using an inkjet device. For example, a typical inkjet printer is capable of producing droplet volumes ranging from about 1.5 pL to about 10 pL, while other commercial ultrasonic dispensing techniques can produce droplet volumes of about 0.6 pL. In some embodiments, water is added in a rapid series of droplets. In some embodiments, water is dispensed when registering a loss of volume of more than 10%, of more than 25%, of more than 35%, of more than 50%.
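By way of illustration only, the footprint-based monitoring and re-hydration described above can be sketched as follows. The sketch assumes the droplet is a spherical cap with a known, constant contact angle (an assumption not stated in the text), so that the volume follows from the measured footprint radius. The function names, the 10 pL drop size (chosen from the inkjet range mentioned above) and the 10% loss threshold are hypothetical defaults, not values prescribed by the method.

```python
import math


def cap_volume_pl(footprint_radius_um, contact_angle_deg):
    """Volume (pL) of a spherical-cap droplet from its footprint radius (um).

    Assumes a constant contact angle; 1 pL equals 1000 cubic micrometers.
    """
    theta = math.radians(contact_angle_deg)
    shape = (2 - 3 * math.cos(theta) + math.cos(theta) ** 3) / math.sin(theta) ** 3
    volume_um3 = math.pi * footprint_radius_um ** 3 / 3 * shape
    return volume_um3 / 1000.0


def drops_to_dispense(target_pl, measured_pl, drop_pl=10.0, loss_threshold=0.10):
    """Number of fixed-size drops needed to restore the target volume.

    Returns 0 while the fractional loss is at or below the threshold
    (hypothetical default: 10%).
    """
    loss = (target_pl - measured_pl) / target_pl
    if loss <= loss_threshold:
        return 0
    return max(1, round((target_pl - measured_pl) / drop_pl))


# Example: a 100 um footprint radius at a 90 degree contact angle is a hemisphere.
target = cap_volume_pl(100, 90)
shrunk = cap_volume_pl(90, 90)
print(round(target, 1), round(shrunk, 1), drops_to_dispense(target, shrunk))
```

Because the cap volume scales with the cube of the footprint radius at a fixed contact angle, a 10% decrease in radius already corresponds to a loss of about 27% of the volume, which is why frequent monitoring is useful during heating steps.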

In another embodiment, the evaporation rate can be limited by adding a compound having a high boiling point component to the droplet(s) of interest, provided that the presence of the compound does not inhibit the enzymatic reactions performed on the substrate. The boiling point of a liquid is the temperature at which the liquid and vapor phases are in equilibrium with each other at a specified pressure. When heat is applied to a solution, the temperature of the solution rises until the vapor pressure of the liquid equals the pressure of the surrounding gases. At this point, vaporization or evaporation occurs at the surface of the solution. By adding a high boiling point liquid to the droplet of interest, evaporation of the water content of a droplet may be substantially reduced (see U.S. Pat. No. 6,177,558). In some embodiments, the high boiling point solution is a solvent. In some embodiments, the high boiling point liquid has a boiling point of at least 100° C., at least 150° C., at least 200° C. In some embodiments, glycerol is added to the solution, increasing the boiling point. Accordingly, the solution containing the high boiling point liquid will evaporate at a much slower rate at room temperature or at reaction conditions such as thermocycling, extension, ligation and denaturation.

In other embodiments, evaporation rate is limited by raising the vapor rate or humidity surrounding the droplet. This can be performed, for example, by placing “sacrificial” droplets around or in close proximity to the droplet of interest (e.g. around or in close proximity to a droplet comprising the oligonucleotides) (see, for example, Berthier E. et al., Lab Chip, 2008, 8(6):852-859).

Some aspects of the invention relate to devices to control the humidity and/or the evaporation rate. In some embodiments, the surface or solid support is enclosed in a closed container to limit the evaporation. For example, humidity control can be achieved on a microvolume sealed plate. A substrate or support can be provided, the substrate having defined features such that volumes of reactions can be deposited in such features via, for example, an inkjet or inkjet-like liquid dispensing technology. A lid can be used to seal such reaction volumes by either applied pressure or using a lid with a pressure-sensitive adhesive on the side contacting the substrate. In some embodiments, the density of these features can be at least 10 features per cm², at least 100 features per cm², at least 1,000 features per cm², at least 10,000 features per cm², at least 100,000 features per cm², at least 1,000,000 features per cm². The features can have diameter (width and length) dimensions of less than about 1 cm, less than about 1 mm, less than about 1 μm. The depth of the features can have dimensions of less than about 1 cm, less than about 1 mm, less than about 1 μm. The width, length, and depth of the features can differ from feature to feature. The feature geometry can be complex, including lines, spirals, bends (at all possible angles from 0.01 degrees to 179.99 degrees) or any combination of such complex geometries.

In another embodiment, the substrate is flat with reaction volumes (e.g. droplets) set up on a surface of the substrate. The lid is designed to have features that form containers or vessels for reaction volumes. The reaction volumes can be created on the substrate using inkjet and inkjet-like liquid manipulation technologies. The lid can be sealed against the substrate by either applied pressure or using a lid or substrate with a pressure-sensitive adhesive on the contacting side.

Aspects of the invention relate to feedback controlled humidity devices, systems and methods. In some embodiments, the device comprises a confinement chamber structure. A volume of a mixture of different gases, such as oxygen, nitrogen, argon, helium, water vapor, solvent vapor, and any other desirable gases, can be maintained inside a confinement chamber structure consisting of walls. In some embodiments, openings on the wall allow introduction and removal of different components of the gases to achieve the desired composition of the gas mixture in the volume. Additional openings can be used to serve as measurement or sampling ports to examine the condition or composition of the gas in the volume. A substrate carrying small reaction volumes (e.g. droplets) deposited by, for example, an inkjet or inkjet-like liquid dispensing technology can be placed inside the chamber's volume.

The chamber's volume can be further confined by a lid. In some embodiments, the lid is temperature controlled. The lid can be made of a material that is optically transparent, such as glass. The heating of the lid can be accomplished via an electrically conductive layer of indium tin oxide (ITO), heated via Ohmic heating. Other heating or cooling methods are also possible, for example, via forced fluid flow. The chamber's volume can be further confined by a bottom. In some embodiments, the bottom is temperature controlled. In some embodiments, the volume is modulated to contain an environment that has the exact molar ratio of different gas mixtures. In a preferred embodiment, the molar ratio of water vapor and carrier gas (air, helium, argon, nitrogen, or any other desirable gas, including solvent vapors) can be controlled, together with the temperature of the volume, to allow an equilibrium between water or solvent evaporation and condensation on the surface of the substrate. This equilibrium allows the reaction volumes on the substrate to be maintained at the desirable steady volume over an appropriate period of time. The appropriate period of time can be in the range of seconds, minutes, hours or days. One skilled in the art would appreciate that the droplet volumes can be maintained, decreased or increased by controlling evaporation and/or condensation. For example, evaporation of the reaction volumes on the substrate is induced to, for example, achieve and/or control sample concentration and/or decrease the reaction volumes. Yet in other embodiments, condensation of the reaction volumes on the substrate is induced to, for example, achieve and/or control sample dilution and/or increase reaction volumes. One would appreciate that it is important to control condensation when increasing reaction volumes. In some embodiments, condensation can be controlled by periodic humidity compensation. For example, by increasing the temperature on the substrate and/or lowering the humidity in the chamber, evaporation can be induced over a short period of time (in the range of ms, s or min). The evaporation of small satellite droplets (e.g. off-target droplets) will take place before evaporation of larger droplets (e.g. reaction volumes), because the evaporation rate (by volume) is proportional to the droplets' surface areas, and smaller droplets, having a higher surface-to-volume ratio, evaporate first. In other embodiments, condensation may be controlled by controlling the substrate's surface properties such as hydrophilicity/hydrophobicity. One skilled in the art will appreciate that condensation or droplet growth is characterized by nucleation of the droplet at nucleation sites. The rate of nucleation is a function of the surface tension and the wetting angle. Accordingly, surfaces promoting nucleation have a wetting contact angle greater than zero. In some embodiments, condensation can be controlled by designing off-target areas on the surface (such as interfeatures) having surface properties impairing nucleation. For example, the substrate's surface can be treated so that off-target areas are more hydrophobic than areas where droplet growth is desired. In other embodiments, the off-target area surfaces are designed to be smooth so that nucleation is reduced.
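As a rough illustration of why a short evaporation pulse removes satellite droplets first, the statement that the volumetric evaporation rate is proportional to surface area can be worked through for an idealized hemispherical droplet. For a hemisphere of radius r, V = (2/3)πr³ and A = 2πr², so dV/dt = −kA reduces to dr/dt = −k, and the lifetime is simply r/k. The hemispherical shape, the rate constant k and the droplet sizes below are illustrative assumptions, not values from the description.

```python
def lifetime_s(radius_um, k_um_per_s):
    """Time for a hemispherical droplet to vanish when dV/dt = -k * area."""
    return radius_um / k_um_per_s


def radius_after(radius_um, k_um_per_s, t_s):
    """Radius remaining after an evaporation pulse of duration t_s."""
    return max(0.0, radius_um - k_um_per_s * t_s)


def volume_fraction_left(radius_um, k_um_per_s, t_s):
    """Fraction of the initial volume that survives the pulse."""
    return (radius_after(radius_um, k_um_per_s, t_s) / radius_um) ** 3


# Hypothetical numbers: 5 um satellite, 50 um reaction droplet, k = 1 um/s.
k = 1.0
pulse = lifetime_s(5.0, k)                    # just long enough to clear the satellite
print(radius_after(5.0, k, pulse))            # satellite radius after the pulse
print(volume_fraction_left(50.0, k, pulse))   # reaction volume fraction retained
```

Under these assumptions the satellite is removed completely while the reaction droplet keeps roughly 73% of its volume, which can then be restored by controlled condensation or dispensing.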

In some aspects of the invention, the reaction volumes are controlled via a feedback control. In some embodiments, one or more monitoring isolated volumes (e.g. monitoring droplets) are used to monitor a plurality of isolated reaction volumes (e.g. droplets comprising predefined oligonucleotide sequences) on a support. In some embodiments, a first support is provided which comprises a plurality of reaction droplets and a second support is provided comprising at least one monitoring droplet. Preferably, the at least one monitoring droplet has the same surface-to-volume ratio as at least one of the reaction droplets of interest and an identical solvent composition. Accordingly, modification of the volume of the monitoring droplet is indicative of the modification of volume of the at least one droplet of interest. In some embodiments, the reaction droplets and monitoring droplets are placed on the same support.

In some embodiments, the molar ratio of the mixture of gases is measured using a cold mirror setup. An optically reflective surface can be placed on the bottom surface, next to the substrate. The mirror can be of a similar material and similar thickness to the substrate to best mimic the thermal behavior of the substrate. The reflective surface on the mirror can be on the top surface or on the bottom surface. In some embodiments, the mirror is placed on the same substrate as the reaction volumes. In some embodiments, areas on the substrate can be made to act as mirrors to provide multiple measurement locations on the substrate. An optical assembly, consisting of a source, an optical train, and a detector, can be placed outside of the chamber's volume. In some embodiments, measurement of the fogging or condensation of fine water droplets onto a mirror is used to measure the condensation or the evaporation rate. The condensation condition on the mirror can be detected by measuring the intensity of the optical beam reflected off the surface. The beam can be modulated in time and wavelength, via the modulation of the source, to achieve higher signal-to-noise ratios. In preferred embodiments, the system comprises a control loop. In an exemplary embodiment, the control loop includes a detector which feeds measured optical intensities to a signal conditioning circuit. The output of the signal conditioning circuit is used by the cold mirror logic to determine the condensation state on the mirror, and to calculate the molar ratio of the mix of gases in the volume using other inputs such as temperature, pressure, etc. The system comprises a temperature sensor, pressure sensor, and/or any other suitable sensors. The humidity and temperature logic determines the actuation of the humidifier, dehumidifier, and/or temperature controllers to effectuate the desirable conditions determined by the cold mirror logic.
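One possible form of the cold mirror logic is sketched below for illustration. It assumes that the mirror temperature at which fogging is first detected is taken as the dew point (the usual chilled-mirror principle, which the description does not spell out), and it uses the Magnus empirical approximation for the saturation vapor pressure of water; the coefficients, the function names, the set point and the dead band are assumptions introduced for this sketch rather than parameters of the described system.

```python
import math


def saturation_pressure_hpa(temp_c):
    """Saturation vapor pressure of water (hPa), Magnus approximation."""
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def water_mole_fraction(dew_point_c, total_pressure_hpa=1013.25):
    """Mole fraction of water vapor in the chamber gas, from the dew point."""
    return saturation_pressure_hpa(dew_point_c) / total_pressure_hpa


def relative_humidity(dew_point_c, chamber_temp_c):
    """Relative humidity (0 to 1) at the chamber temperature."""
    return saturation_pressure_hpa(dew_point_c) / saturation_pressure_hpa(chamber_temp_c)


def humidity_action(rh, setpoint, deadband=0.02):
    """Simple on/off decision for the humidifier and dehumidifier."""
    if rh < setpoint - deadband:
        return "humidify"
    if rh > setpoint + deadband:
        return "dehumidify"
    return "hold"


# Example: mirror fogs at 23 C while the chamber is held at 25 C.
x_water = water_mole_fraction(23.0)
rh = relative_humidity(23.0, 25.0)
print(round(x_water, 4), round(rh, 3), humidity_action(rh, setpoint=0.98))
```

The molar ratio of water vapor to carrier gas follows as x/(1 − x), where x is the returned mole fraction. A practical controller would likely use proportional or PID actuation in place of the on/off rule shown here.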

In some embodiments, the reaction volumes contain the necessary reagents to allow enzyme mediated biochemical reactions to take place between the molecular population inside the reaction volume (e.g. droplet) and the molecular population present on the wetted surface in contact with the reaction volume. One would appreciate that the reaction volume can be used to carry out a variety of reactions including, but not limited to, amplification, hybridization, extension, ligation, sequencing, in-vitro transcription, in-vitro translation, or any other reaction of interest. The molecular population may contain nucleic acids, DNA, RNA, oligonucleotides, proteins, dNTPs, salts, buffer components, detergents, and/or any other appropriate component. The reaction volume may comprise an enzyme, such as a polymerase, a ligase, a CEL1-like endonuclease, a nuclease, mixtures of such enzymes, and/or any other appropriate enzymes. In some embodiments, the products of the enzyme mediated biochemical reaction can include nucleic acids, DNA, RNA, oligonucleotides, proteins, labeled nucleic acids, amplified nucleic acids (e.g. clonal amplification of a selected population of nucleic acid), assembled nucleic acids, etc.

In some aspects of the invention, the reagents in the reaction volumes promote oligonucleotide or polynucleotide assembly. In some embodiments, the reaction volumes may contain two or more populations of single-stranded oligonucleotides having predefined sequences in solution. The populations of oligonucleotides can hybridize to a single-stranded oligonucleotide attached to the wetted surface, thereby forming double-stranded hybrids or duplexes attached to the surface. In some embodiments, the double-stranded hybrids contain breaks and gaps in the phosphodiester backbone, formed at the junctions of different oligonucleotide populations. In some embodiments, a polymerase and dNTPs and other necessary components are added to fill the gaps in the backbone. In other embodiments, a ligase and other necessary components are added to mend breaks in the backbone.

In other embodiments, the reaction volumes may contain two or more populations of oligonucleotides in solution, each population of oligonucleotides having a predefined sequence. In some embodiments, each population of oligonucleotides has a sequence complementary to another population of oligonucleotides. In this manner, the populations of oligonucleotides can hybridize to form double-stranded hybrids or duplexes in solution. The hybrids may contain breaks and gaps in the phosphodiester backbone, formed at the junctions of different oligonucleotide populations. In some embodiments, a polymerase and dNTPs and other necessary components are added to fill the gaps in the backbone. In other embodiments, a ligase and other necessary components are added to mend breaks in the backbone.

Amplification Reactions

Aspects of the invention provide methods for the amplification of one or more single-stranded oligonucleotides on the support. Oligonucleotides may be amplified before or after being detached from the support and/or eluted in a droplet. Preferably, the oligonucleotides are amplified on the solid support. One skilled in the art will appreciate that oligonucleotides that are synthesized on a solid support will comprise a phosphorylated 3′ end or an additional 3′-terminal nucleoside. The 3′-phosphorylated oligonucleotides are not suitable for polynucleotide assembly as the oligonucleotides cannot be extended by polymerase. In preferred aspects of the invention, the oligonucleotides are first amplified and the amplified products are assembled into a polynucleotide. Accordingly, aspects of the invention provide methods wherein a set or subset of oligonucleotides, attached to a set or subset of features of the support, are amplified by locally delivering sub-microvolumes at addressable discrete features. The term “amplification” means that the number of copies of a nucleic acid fragment is increased. As noted above, the oligonucleotides may be first synthesized onto discrete features of the surface, may be deposited on the substrate or may be deposited on the substrate attached to nanoparticles. In a preferred embodiment, the oligonucleotides are covalently attached to the surface or to nanoparticles deposited on the surface. In an exemplary embodiment, locations or features comprising the oligonucleotides to be amplified are first selected. In a preferred embodiment, the selected features are in close proximity to each other on the support. Aqueous solution is then deposited on the selected features, thereby forming a droplet comprising hydrated oligonucleotides. One would appreciate that each droplet is separated from the others by surface tension. In some embodiments, the solution can be water, buffer or a solution promoting enzymatic reactions. In an exemplary embodiment, the solution includes, but is not limited to, a solution promoting primer extension. For example, the solution may be composed of oligonucleotide primer(s), nucleotides (dNTPs), buffer, polymerase and cofactors. In other embodiments, the solution is an alkaline denaturing solution. Yet, in other embodiments, the solution may comprise oligonucleotides such as complementary oligonucleotides.

In some embodiments, oligonucleotides or polynucleotides are amplified within the droplet by solid phase PCR, thereby eluting the amplified sequences into the droplet. In other embodiments, oligonucleotides or polynucleotides are first detached from the solid support and then amplified. For example, covalently-attached oligonucleotides are translated into surface supported DNA molecules through a process of gaseous cleavage using amine gas. Oligonucleotides can be cleaved with ammonia, or other amines, in the gas phase, whereby the reagent gas comes into contact with the oligonucleotide while attached to, or in proximity to, the solid support (see Boal et al., Nucl. Acids Res., 1996, 24(15):3115-7; U.S. Pat. Nos. 5,514,789; 5,738,829 and 6,664,388). In this process, the covalent bond attaching the oligonucleotides to the solid support is cleaved by exposing the solid support to the amine gas under elevated pressure and/or temperature. In some embodiments, this process may be used to “thin” the density of oligonucleotides at specific features.

One would appreciate that amplification occurs only on features comprising hydrated template oligonucleotides (i.e. local amplification at features comprising a droplet volume). Different sets of features may be amplified in a parallel or sequential fashion with parallel or sequential rounds of hydrating (i.e. dispensing a droplet volume on a specific feature), amplifying oligonucleotides and drying the set of features. In some embodiments, the support is dried by evaporating liquid in a vacuum while heating. Thus, after each round of amplification, the support will comprise a set of droplets containing oligonucleotide duplexes. The complementary oligonucleotides can be released in solution within the droplet and be collected. Alternatively, complementary oligonucleotides may be dried onto the discrete features for storage or further processing. Yet, complementary oligonucleotides can be subjected to further reactions such as error filtration and/or assembly. In some embodiments, a different set or subset of features can then be hydrated and a different set or subset of template oligonucleotides can be amplified as described herein. In the case of enzymatic amplification, the solution includes, but is not limited to, primers, nucleotides, buffers, cofactors, and enzyme. For example, an amplification reaction includes DNA polymerase, nucleotides (e.g. dATP, dCTP, dTTP, dGTP), primers and buffer.

According to some aspects of the invention, hydrated oligonucleotides can be amplified within the droplet, the droplet acting as a virtual reaction chamber. In some embodiments, the entire support or array containing the discrete features is subjected to amplification. In other embodiments, one or more discrete features are subjected to amplification. Amplification of selected independent features (being separated from each other) can be performed by locally heating at least one discrete feature. Discrete features may be locally heated by any means known in the art. For example, the discrete features may be locally heated using a laser source of energy that can be controlled in a precise x-y dimension, thereby individually modulating the temperature of a droplet. In another example, the combination of a broader beam laser with a mask can be used to irradiate specific features. In some embodiments, methods to control temperature on the support so that enzymatic reactions can take place on a support (PCR, ligation or any other temperature sensitive reaction) are provided. In some embodiments, a scanning laser is used to control the thermocycling on distinct features on the solid support. The wavelength used can be chosen from a wide spectrum (100 nm to 100,000 nm, i.e. from ultraviolet to infrared). In some embodiments, the feature on which the droplet is spotted comprises an optical absorber or indicator. In some other embodiments, optical absorbent material can be added on the surface of the droplet. The energy to be deposited can be calculated based on the absorbance behavior. In some embodiments, the temperature of the droplet can be modeled using thermodynamics. The temperature can be measured by an LCD-like material or any other in-situ technology. In some embodiments, the solid support is cooled by circulation of air or fluid. For example, the whole support can be heated and cooled down to allow enzymatic reactions to take place.
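As a hedged illustration of how the energy to be deposited might be estimated from the absorbance behavior, the sketch below combines the sensible heat of a water droplet (Q = m·c·ΔT) with the Beer-Lambert absorbed fraction (1 − 10^(−A)). It deliberately ignores heat lost to the support and to evaporation, assumes the droplet has the density and specific heat of water, and uses hypothetical function names; a real system would be calibrated against measured droplet temperatures.

```python
def heating_energy_uj(volume_nl, delta_t_k, specific_heat_j_per_g_k=4.186):
    """Energy (microjoules) to raise a water droplet by delta_t_k kelvin.

    Assumes water density of about 1 g/mL, so 1 nL weighs about 1e-6 g.
    """
    mass_g = volume_nl * 1e-6
    return mass_g * specific_heat_j_per_g_k * delta_t_k * 1e6


def incident_energy_uj(absorbed_uj, absorbance):
    """Incident optical energy needed so that absorbed_uj is actually absorbed.

    The absorbed fraction follows Beer-Lambert: 1 - 10**(-absorbance).
    """
    if absorbance <= 0:
        raise ValueError("absorbance must be positive for optical heating")
    fraction = 1.0 - 10.0 ** (-absorbance)
    return absorbed_uj / fraction


# Example: heat a 1 nL droplet from 25 C to 95 C through an absorber with A = 1.
needed = heating_energy_uj(1.0, 70.0)
print(round(needed, 2), round(incident_energy_uj(needed, 1.0), 2))
```

The estimate gives only a lower bound on the optical energy, since conduction into the support and evaporative cooling both draw heat away from the droplet during irradiation.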

In some embodiments, a selected set of features may be protected from hydration by using an immiscible fluid system as described above. An immiscible fluid system, such as oil and aqueous reagents, can be used to achieve passivation of sites on which reactions take place.

In some embodiments, the oligonucleotides may comprise universal (common to all oligonucleotides), semi-universal (common to at least a portion of the oligonucleotides) or individual or unique primer (specific to each oligonucleotide) binding sites on either the 5′ end or the 3′ end or both. As used herein, the term “universal” primer or primer binding site means that a sequence used to amplify the oligonucleotide is common to all oligonucleotides such that all such oligonucleotides can be amplified using a single set of universal primers. In other circumstances, an oligonucleotide contains a unique primer binding site. As used herein, the term “unique primer binding site” refers to a set of primer recognition sequences that selectively amplifies a subset of oligonucleotides. In yet other circumstances, an oligonucleotide contains both universal and unique amplification sequences, which can optionally be used sequentially.

In some embodiments, primers/primer binding sites may be designed to be temporary. For example, temporary primers may be removed by chemical, light based or enzymatic cleavage. For example, primers/primer binding sites may be designed to include a restriction endonuclease cleavage site. In an exemplary embodiment, a primer/primer binding site contains a binding and/or cleavage site for a type IIs restriction endonuclease. In such case, amplification sequences may be designed so that once a desired set of oligonucleotides is amplified to a sufficient amount, it can then be cleaved by the use of an appropriate type IIs restriction enzyme that recognizes an internal type IIs restriction enzyme sequence of the oligonucleotide. In some embodiments, after amplification, the pool of nucleic acids may be contacted with one or more endonucleases to produce double-stranded breaks, thereby removing the primers/primer binding sites. In certain embodiments, the forward and reverse primers may be removed by the same or different restriction endonucleases.

Any type of restriction endonuclease may be used to remove the primers/primer binding sites from nucleic acid sequences. A wide variety of restriction endonucleases having specific binding and/or cleavage sites are commercially available, for example, from New England Biolabs (Beverly, Mass.). In various embodiments, restriction endonucleases that produce 3′ overhangs, 5′ overhangs or blunt ends may be used. When using a restriction endonuclease that produces an overhang, an exonuclease (e.g., RecJf, Exonuclease I, Exonuclease T, S1 nuclease, P1 nuclease, mung bean nuclease, T4 DNA polymerase, CEL I nuclease, etc.) may be used to produce blunt ends. Alternatively, the sticky ends formed by the specific restriction endonuclease may be used to facilitate assembly of subassemblies in a desired arrangement. In an exemplary embodiment, a primer/primer binding site that contains a binding and/or cleavage site for a type IIs restriction endonuclease may be used to remove the temporary primer. The term “type-IIs restriction endonuclease” refers to a restriction endonuclease having a non-palindromic recognition sequence and a cleavage site that occurs outside of the recognition site (e.g., from 0 to about 20 nucleotides distal to the recognition site). Type IIs restriction endonucleases may create a nick in a double-stranded nucleic acid molecule or may create a double-stranded break that produces either blunt or sticky ends (e.g., either 5′ or 3′ overhangs). Examples of Type IIs endonucleases include, for example, enzymes that produce a 3′ overhang, such as, for example, Bsr I, Bsm I, BstF5 I, BsrD I, Bts I, Mnl I, BciV I, Hph I, Mbo II, Eci I, Acu I, Bpm I, Mme I, BsaX I, Bcg I, Bae I, Bfi I, TspDT I, TspGW I, Taq II, Eco57 I, Eco57M I, Gsu I, Ppi I, and Psr I; enzymes that produce a 5′ overhang, such as, for example, BsmA I, Ple I, Fau I, Sap I, BspM I, SfaN I, Hga I, Bbv I, Fok I, BceA I, BsmF I, Ksp632 I, Eco31 I, Esp3 I, and Aar I; and enzymes that produce a blunt end, such as, for example, Mly I and Btr I. Type-IIs endonucleases are commercially available and are well known in the art (New England Biolabs, Beverly, Mass.).
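To illustrate how a type IIs site placed in a temporary primer region cuts outside its own recognition sequence, the following sketch locates a recognition site on the top strand and reports the downstream cut positions. The defaults use Fok I, which recognizes GGATG and cuts 9 nucleotides downstream on the top strand and 13 on the bottom strand, leaving a 4-nucleotide 5′ overhang. For brevity the sketch searches only the top-strand orientation and treats the sequence as linear; the function name and the example sequence are hypothetical.

```python
def type_iis_cuts(seq, site="GGATG", top_offset=9, bottom_offset=13):
    """Return (top_cut, bottom_cut, overhang) for each top-strand site.

    Cut positions are 0-based indices into the top strand: the strand is
    cleaved immediately before that index. Only the forward orientation
    of the recognition site is considered in this simplified sketch.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        site_end = start + len(site)
        top = site_end + top_offset
        bottom = site_end + bottom_offset
        if max(top, bottom) <= len(seq):
            overhang = seq[min(top, bottom):max(top, bottom)]
            cuts.append((top, bottom, overhang))
        start = seq.find(site, start + 1)
    return cuts


# Hypothetical construct: primer region with the site, a 9-nt spacer, then payload.
construct = "AA" + "GGATG" + "ACGTACGTA" + "TTTT" + "CCC"
print(type_iis_cuts(construct))
```

In this arrangement everything upstream of the top-strand cut, including the recognition site itself, is removed with the primer, so the retained fragment carries no trace of the restriction site.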

After amplification, the polymerase may be deactivated to prevent interference with the subsequent steps. A heating step (e.g. high temperature) can denature and deactivate most enzymes which are not thermally stable. Enzymes may be deactivated in the presence (e.g. within the droplet) or in the absence of liquid (e.g. dry array). Heat deactivation on a dry support has the advantage of deactivating the enzymes without any detrimental effect on the oligonucleotides. In some embodiments, a non-thermal stable version of the thermally stable PCR DNA Polymerase may be used, although the enzyme is less optimized for error rate and speed. Alternatively, Epoxy dATP can be used to inactivate the enzyme.

The complementary oligonucleotides produced by amplification can be released in solution within the droplet by way of stringent melt. The conditions for stringent melt (e.g., a precise melting temperature) can be determined by observing a real-time melt curve. In an exemplary melt curve analysis, PCR products are slowly heated in the presence of double-stranded DNA (dsDNA) specific fluorescent dyes (e.g., SYBR Green, LCGreen, SYTO9 or EvaGreen). With increasing temperature the dsDNA denatures (melts), releasing the fluorescent dye with a resultant decrease in the fluorescent signal. The temperature at which dsDNA melts is determined by factors such as nucleotide sequence, DNA length and GC/AT ratio. Typically, G-C base pairs in a duplex are estimated to contribute about 3° C. to the Tm, while A-T base pairs are estimated to contribute about 2° C., up to a theoretical maximum of about 80-100° C. However, more sophisticated models of Tm are available and may be used, in which G-C stacking interactions, solvent effects, the desired assay temperature and the like are taken into account. Melt curve analysis can detect a single base difference. Various methods for accurate temperature control at individual features can be used as disclosed herein.
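The additive estimate mentioned above can be written as a short function, shown here only as a sketch. The per-base-pair contributions default to the values given in the text (about 3° C. per G-C pair and about 2° C. per A-T pair), and the cap reflects the upper end of the stated theoretical maximum; the function name and the choice of 100° C. for the cap are assumptions. The widely quoted Wallace rule uses 4° C. per G-C pair instead, which can be reproduced by passing gc_c=4.0.

```python
def rough_tm_c(seq, gc_c=3.0, at_c=2.0, cap_c=100.0):
    """Additive melting-temperature estimate (deg C) for a short duplex.

    gc_c and at_c are per-base-pair contributions; the result is capped
    at cap_c. Raises ValueError on characters other than A, C, G, T.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return min(cap_c, gc * gc_c + at * at_c)


print(rough_tm_c("ATGCATGCAT"))            # contributions from the text
print(rough_tm_c("ATGCATGCAT", gc_c=4.0))  # Wallace-rule style contributions
```

Estimates of this kind are useful mainly for grouping features with similar expected Tm; the actual stringent melt temperature would still be set from a measured melt curve or a more detailed model.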

One method to control the temperature of the surface droplets is by using a scanning optical energy deposition setup. For example, a Digital Micromirror Device (DMD) can be used for temperature control. A DMD is an optical semiconductor. See, for example, U.S. Pat. No. 7,498,176. In some embodiments, a DMD can be used to precisely heat selected features or droplets on the solid support. The DMD can be a chip having on its surface several hundred thousand microscopic mirrors arranged in a rectangular array which correspond to the features or droplets to be heated. The mirrors can be individually rotated (e.g., ±10-12°) to an on or off state. In the on state, light from a light source (e.g., a bulb) is reflected onto the solid support to heat the selected spots or droplets. In the off state, the light is directed elsewhere (e.g., onto a heatsink). In one example, the DMD can consist of a 1024×768 array of 16 μm wide micromirrors. These mirrors can be individually addressable and can be used to create any given pattern or arrangement in heating different features on the solid support. The features can also be heated to different temperatures, e.g., by providing different wavelengths for individual spots, and/or controlling the time of irradiation.
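A minimal sketch of how an on/off mirror pattern could be generated for selected features is given below, using the 1024×768 array size from the example above. The mapping from droplet positions to mirror coordinates, the square patch of mirrors per feature, and the proportional on-time rule used to reach different temperatures are all illustrative assumptions; no particular DMD controller interface is implied.

```python
DMD_COLS, DMD_ROWS = 1024, 768  # array size from the example in the text


def mirror_mask(features, radius=2):
    """Boolean on/off pattern with a square patch of mirrors per feature.

    features: iterable of (col, row) mirror coordinates of droplets to heat.
    radius: half-width of the patch in mirrors (hypothetical default).
    """
    mask = [[False] * DMD_COLS for _ in range(DMD_ROWS)]
    for col, row in features:
        for r in range(max(0, row - radius), min(DMD_ROWS, row + radius + 1)):
            for c in range(max(0, col - radius), min(DMD_COLS, col + radius + 1)):
                mask[r][c] = True
    return mask


def duty_cycle(current_c, target_c, gain=0.05):
    """Fraction of each frame the mirrors stay on (proportional rule)."""
    return min(1.0, max(0.0, gain * (target_c - current_c)))


mask = mirror_mask([(10, 20), (500, 300)], radius=1)
print(sum(map(sum, mask)), duty_cycle(85.0, 95.0))
```

Varying the duty cycle per feature corresponds to controlling the time of irradiation, which is one of the two ways the text mentions for heating features to different temperatures.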

In some embodiments, the entire support or array containing the discrete features is heated to a denaturing temperature. Preferably, denaturation of double-stranded nucleic acid is performed in solution (e.g. within the droplet). During the heat denaturation step, the temperature of the support is raised to a stringent melt temperature or to a denaturing temperature (95° C. to 100° C.). Elevating the temperature of the support to a denaturing or stringent melt temperature allows the homoduplexes to dissociate into single strands before complete evaporation of the droplet volume. Heating the substrate results in the denaturation and evaporation of the solution, resulting in dried single-stranded oligonucleotides on the discrete features. At this point, the entire support may be cooled down to a predefined hybridization or annealing temperature. A set of selected features or the totality of the features may be re-hydrated by addition of the appropriate annealing buffer (at the appropriate annealing temperature) at the selected features or on the entire support. Single-stranded oligonucleotides may then be resuspended and allowed to diffuse and to hybridize or anneal to form the double-stranded oligonucleotides (homoduplexes or heteroduplexes).

Accordingly, some aspects of the invention relate to the recognition andlocal removal of double-stranded oligonucleotides containing sequencemismatch errors at specific features. In one preferred embodiment of theinvention, mismatch recognition can be used to control the errorsgenerated during oligonucleotide synthesis, gene assembly, and theconstruction of longer polynucleotides. After amplification, thetotality of the features or a set of the features comprisingoligonucleotide duplexes are first subjected to round(s) of melting andannealing as described above. Subsequently, a first set of discretefeatures comprising oligonucleotides having same theoretical Tm arehydrated and oligonucleotides are allowed to anneal under annealingconditions. Hydrated features are then subjected to a first stringentmelt condition (condition 1). It would be appreciate that for sequentiallocal error removal, it is preferable to first start with the stringentmelt conditions corresponding to the lowest Tm (Tm(1)) and conclude withstringent melt conditions corresponding to the higher Tm (Tm(n)). Inother embodiments, the totality of the features of the support may behydrated and subjected to the lowest Tm temperature. Under the firstspecific stringent melt conditions Tm(1), only the oligonucleotides thatare hybridized in an unstable duplex will de-hybridize. De-hybridizedoligonucleotides may be removed for example, using a vacuum or may bewashed away. In a subsequent step, the support may be dried out and asecond discrete features comprising oligonucleotides having a Tm higherthan Tm(1) (for example (Tm(2)) are selectively rehydrated and allowedto anneal under annealing conditions, In other embodiments, the totalityof the features of the support may be re-hydrated and subjected to thesecond Tm temperature Tm(2) wherein Tm(2) is higher than Tm(1). 
These steps of selective hydration, annealing, stringent melt and removal of error-containing oligonucleotides can be repeated multiple times until all discrete features have been subjected to the appropriate stringent melt condition (theoretically 80-100° C.). Alternatively, a mismatch-detecting endonuclease may be added to the droplet solution. In an exemplary embodiment, a Surveyor™ Nuclease (Transgenomic Inc.) may be added to the hydrated feature containing the oligonucleotide duplexes. Surveyor™ Nuclease is a mismatch-specific endonuclease that cleaves all types of mismatches such as single nucleotide polymorphisms, small insertions or deletions. Addition of the endonuclease results in the cleavage of the double-stranded oligonucleotides at the site of the mismatch. The remaining portion of the oligonucleotide duplexes can then be melted at a lower and less stringent temperature (e.g. stringent melt) needed to distinguish a single base mismatch. One would appreciate that the error removal steps as well as the amplification steps may be repeated in a sequential fashion or in a highly parallel fashion by controlling the temperature of the entire support or of the independent features as described above.
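
By way of non-limiting illustration only, the sequential error removal described above can be viewed as a scheduling step in which features are grouped by theoretical Tm and processed from the lowest Tm to the highest. The following Python sketch assumes that each feature carries a single known theoretical Tm; the feature identifiers, the Tm values, and the function name `melt_schedule` are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def melt_schedule(feature_tm):
    """Group features by theoretical Tm, ordered from lowest to highest Tm.

    feature_tm: dict mapping a (hypothetical) feature identifier to its
    theoretical Tm in degrees Celsius.
    Returns a list of (tm, [feature identifiers]) tuples, lowest Tm first.
    """
    groups = defaultdict(list)
    for feature, tm in feature_tm.items():
        groups[tm].append(feature)
    return [(tm, sorted(groups[tm])) for tm in sorted(groups)]


# Hypothetical features and Tm values, for illustration only.
features = {"A1": 62.0, "A2": 58.0, "B1": 62.0, "B2": 70.0}
schedule = melt_schedule(features)
for step, (tm, ids) in enumerate(schedule, start=1):
    # Each step: hydrate the listed features, anneal, apply the stringent
    # melt for this Tm, then remove de-hybridized oligonucleotides.
    print(f"condition {step}: features {ids}, stringent melt for Tm({step}) = {tm}")
```

In this sketch, features sharing the same theoretical Tm are hydrated together, which mirrors the selective hydration of one set of features per stringent melt condition.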

One skilled in the art will appreciate that releasing oligonucleotides from the solid support can be achieved by a number of different techniques which will depend on the technique used to attach or synthesize the oligonucleotides on the solid support. Preferably, the oligonucleotides are attached or synthesized via a linker molecule and subsequently detached and released. In some embodiments, a plurality of oligonucleotides may be attached or synthesized to the support, cleaved at a cleavable linker site and released in solution. For example, U.S. Pat. No. 7,563,600 discloses a cleavable linker having a succinate moiety bound to a nucleotide moiety such that the cleavage produces a 3′-hydroxy-nucleotide. The succinate moiety is bound to the solid support through an ester linkage by reacting the succinate moieties with the hydroxyl on the solid support. US Patent application discloses sulfonyl cleavable linkers comprising a linker hydroxyl moiety and a base-labile cleaving moiety. A phosphorus-oxygen bond is formed between the phosphorus of the sulfonyl amidite moieties and the oxygen of the hydroxyl groups at known locations of the support. In some embodiments, the oligonucleotides are attached or synthesized using a photo-labile linker (see, for example, Tosquellas et al., Nucl. Acids Res., 1998, Vol. 26, pp 2069-2074). In some instances, the photolabile linker can be rendered labile by activation under an appropriate chemical treatment. For example, U.S. Pat. No. 7,183,406 discloses a safety-catch linker which is stable under the oligonucleotide synthesis conditions and that is photolabile after treatment with trifluoroacetic acid. Oligonucleotides linked with a photo-labile linker can then be released by photolysis. Using photolabile linkers, it is therefore possible to selectively release in solution (e.g., in a droplet) specific oligonucleotides at predetermined features. 
The oligonucleotides released in solution may then be brought into contact for further processing (hybridization, extension, assembly, etc.) by merging droplets or moving the oligonucleotides from one feature to a next feature on a solid support.

One skilled in the art will appreciate that DNA microarrays can have a very high density of oligonucleotides on the surface (approximately 10⁸ molecules per feature), which can generate steric hindrance to polymerases needed for PCR. Theoretically, the oligonucleotides are generally spaced apart by about 2 nm to about 6 nm. For polymerases, a typical 6-subunit enzyme can have a diameter of about 12 nm. Therefore, the support may need to be custom treated to address the surface density issue such that the spacing of surface-attached oligonucleotides can accommodate the physical dimension of the enzyme. For example, a subset of the oligonucleotides can be chemically or enzymatically cleaved, or physically removed from the microarray. Other methods can also be used to modify the oligonucleotides such that when primers are applied and annealed to the oligonucleotides, at least some 3′ hydroxyl groups of the primers (start of DNA synthesis) are accessible by polymerase. The number of accessible 3′ hydroxyl groups per spot can be stochastic or fixed. For example, the primers, once annealed, can be treated to remove some active 3′ hydroxyl groups, leaving a stochastic number of 3′ hydroxyl groups that can be subject to chain extension reactions. In another example, a large linker molecule (e.g., a concatamer) can be used such that one and only one start of synthesis is available per spot, or in a subset of the oligonucleotides per spot.
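
By way of non-limiting illustration only, the surface density issue can be estimated with simple arithmetic. The following Python sketch assumes, purely for illustration, that the oligonucleotides sit on a uniform square lattice, so that areal density scales with the inverse square of the spacing; this lattice model and the function name `max_retained_fraction` are assumptions and are not part of the disclosure.

```python
def max_retained_fraction(spacing_nm, enzyme_diameter_nm):
    """Upper bound on the fraction of oligonucleotides that can remain so that
    the mean spacing is at least the enzyme diameter.

    Assumes a uniform square lattice, so density is proportional to 1/spacing**2.
    """
    if spacing_nm <= 0 or enzyme_diameter_nm <= 0:
        raise ValueError("spacing and diameter must be positive")
    if spacing_nm >= enzyme_diameter_nm:
        return 1.0  # already sparse enough; nothing needs to be removed
    return (spacing_nm / enzyme_diameter_nm) ** 2


# Spacings of about 2 nm and about 6 nm versus an enzyme of about 12 nm.
for spacing in (2.0, 6.0):
    fraction = max_retained_fraction(spacing, 12.0)
    print(f"spacing {spacing} nm: retain at most {fraction:.3f} of the oligonucleotides")
```

Under this simplified model, a spacing of about 6 nm would call for retaining no more than about one quarter of the oligonucleotides, and a spacing of about 2 nm no more than about one in thirty-six.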

Nucleic Acid Assembly

In some embodiments, methods of assembling libraries containing nucleic acids having predetermined sequence variations are provided herein. Assembly strategies provided herein can be used to generate very large libraries representative of many different nucleic acid sequences of interest. In some embodiments, libraries of nucleic acids are libraries of sequence variants. Sequence variants may be variants of a single naturally-occurring protein-encoding sequence. However, in some embodiments, sequence variants may be variants of a plurality of different protein-encoding sequences.

In some embodiments, the assembly procedure may include several parallel and/or sequential reaction steps in which a plurality of different nucleic acids or oligonucleotides are synthesized or immobilized, amplified, and combined in order to be assembled (e.g., by extension or ligation as described herein) to generate a longer nucleic acid product to be used for further assembly, cloning, or other applications (see U.S. provisional application 61/235,677 and PCT application PCT/US09/55267, which are incorporated herein by reference in their entirety). Amplification and assembly strategies provided herein can be used to generate very large libraries representative of many different nucleic acid sequences of interest.

Accordingly, one aspect of the technology provided herein relates to the design of assembly strategies for preparing precise high-density nucleic acid libraries. Another aspect of the technology provided herein relates to assembling precise high-density nucleic acid libraries. Aspects of the technology provided herein also provide precise high-density nucleic acid libraries. A high-density nucleic acid library may include more than 100 different sequence variants (e.g., about 10² to 10³; about 10³ to 10⁴; about 10⁴ to 10⁵; about 10⁵ to 10⁶; about 10⁶ to 10⁷; about 10⁷ to 10⁸; about 10⁸ to 10⁹; about 10⁹ to 10¹⁰; about 10¹⁰ to 10¹¹; about 10¹¹ to 10¹²; about 10¹² to 10¹³; about 10¹³ to 10¹⁴; about 10¹⁴ to 10¹⁵; or more different sequences) wherein a high percentage of the different sequences are specified sequences as opposed to random sequences (e.g., more than about 50%, more than about 60%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, more than about 90%, more than about 91%, more than about 92%, more than about 93%, more than about 94%, more than about 95%, more than about 96%, more than about 97%, more than about 98%, more than about 99%, or more of the sequences are predetermined sequences of interest).

In some embodiments, the methods and devices provided herein use oligonucleotides that are immobilized on a surface or substrate (e.g., support-bound oligonucleotides). Support-bound oligonucleotides comprise, for example, oligonucleotides complementary to construction oligonucleotides, anchor oligonucleotides and/or spacer oligonucleotides.

Some aspects of the invention relate to a polynucleotide assembly process wherein synthetic oligonucleotides are designed and used as templates for primer extension reactions, synthesis of complementary oligonucleotides and to assemble polynucleotides into longer polynucleotide constructs. In some embodiments, the method includes synthesizing a plurality of oligonucleotides or polynucleotides in a chain extension reaction using a first plurality of single-stranded oligonucleotides as templates. As noted above, the oligonucleotides may be first synthesized onto a plurality of discrete features of the surface, or may be deposited on the plurality of features of the support. The support may comprise at least 100, at least 1,000, at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, or at least 10⁸ features. In a preferred embodiment, the oligonucleotides are covalently attached to the support. In preferred embodiments, the pluralities of oligonucleotides are immobilized to a solid surface. In a preferred embodiment, each feature of the solid surface comprises a high density of oligonucleotides having a different predetermined sequence (e.g., approximately 10⁶-10⁸ molecules per feature).

In some embodiments, pluralities of different single-stranded oligonucleotides are immobilized at different features of a solid support. In some embodiments, the support-bound oligonucleotides may be attached through their 5′ end. In a preferred embodiment, the support-bound oligonucleotides are attached through their 3′ end. In some embodiments, the support-bound oligonucleotides may be immobilized on the support via a nucleotide sequence (e.g., degenerate binding sequence), linker or spacer (e.g., photocleavable linker or chemical linker). It should be appreciated that by 3′ end, it is meant the sequence downstream to the 5′ end and by 5′ end it is meant the sequence upstream to the 3′ end. For example, an oligonucleotide may be immobilized on the support via a nucleotide sequence, linker or spacer that is not involved in hybridization. The 3′ end sequence of the support-bound oligonucleotide then refers to a sequence upstream to the linker or spacer.

In certain embodiments, oligonucleotides may be designed to have a sequence that is identical or complementary to a different portion of the sequence of a predetermined target polynucleotide that is to be assembled. Accordingly, in some embodiments, each oligonucleotide may have a sequence that is identical or complementary to a portion of one of the two strands of a double-stranded target nucleic acid. As used herein, the term “complementary” refers to the capacity for precise pairing between two nucleotides. For example, if a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another at that position. Complementarity between two single-stranded nucleic acid molecules may be “partial,” in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules.
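
By way of non-limiting illustration only, partial and complete complementarity between two single-stranded DNA molecules can be expressed as the fraction of positions that form Watson-Crick pairs when the strands are aligned antiparallel. The following Python sketch assumes equal-length DNA strands written 5′ to 3′ and considers only A-T and G-C pairing; the function names are hypothetical.

```python
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def complementarity(strand_a, strand_b):
    """Fraction of positions that pair when strand_b is aligned antiparallel
    to strand_a. Both strands are written 5' to 3' and must be equally long.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    antiparallel_b = strand_b.upper()[::-1]
    paired = sum(
        1 for x, y in zip(strand_a.upper(), antiparallel_b) if _PAIR.get(x) == y
    )
    return paired / len(strand_a)


print(complementarity("ATGC", reverse_complement("ATGC")))  # complete complementarity
print(complementarity("ATGC", "GCAA"))                      # partial complementarity
```

A value of 1.0 corresponds to complete complementarity, and any value between 0 and 1 corresponds to partial complementarity as the term is used above.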

In some embodiments, the plurality of construction oligonucleotides are designed such that each plurality of construction oligonucleotides comprises a sequence region at its 5′ end that is complementary to a sequence region of the 5′ end of another construction oligonucleotide and a sequence region at its 3′ end that is complementary to a sequence region at a 3′ end of a different construction oligonucleotide. As used herein, a “construction” oligonucleotide refers to one of the plurality or population of single-stranded oligonucleotides used for polynucleotide assembly. The plurality of construction oligonucleotides comprises oligonucleotides for both the sense and antisense strand of the target polynucleotide. Construction oligonucleotides can have any length, the length being designed to accommodate an overlap or complementary sequence. Construction oligonucleotides can be of identical size or of different sizes. In preferred embodiments, the construction oligonucleotides span the entire sequence of the target polynucleotide without any gaps. Yet in other embodiments, the construction oligonucleotides are partially overlapping, resulting in gaps between construction oligonucleotides when hybridized to each other. Preferably, the pool or population of construction oligonucleotides comprises construction oligonucleotides having overlapping sequences so that construction oligonucleotides can hybridize to one another under the appropriate hybridization conditions. One would appreciate that each internal construction oligonucleotide will hybridize to two different construction oligonucleotides whereas the construction oligonucleotides at the 5′ and/or 3′ end will each hybridize to a different (or the same) internal oligonucleotide(s). Hybridization and ligation of the overlapping construction oligonucleotides will therefore result in a target polynucleotide having a 3′ and/or a 5′ overhang. 
Yet in some embodiments, the resulting target polynucleotide may comprise a blunt end at its 5′ and/or 3′ terminus. In some embodiments, if the target polynucleotide is assembled from N construction oligonucleotides, 1 to N pluralities of different support-bound single-stranded oligonucleotides are designed such that the first plurality of construction oligonucleotides comprises at its 5′ end a sequence region that is complementary to a sequence region at the 5′ end of an anchor oligonucleotide and wherein an Nth plurality of construction oligonucleotides comprises at its 3′ end a sequence region that is complementary to a 3′ end sequence region of the (N−1)th construction oligonucleotide. In some embodiments, the first plurality of oligonucleotides has a 5′ end that is complementary to the 5′ end of a support-bound anchor single-stranded oligonucleotide. As used herein, the anchor oligonucleotide refers to an oligonucleotide designed to be complementary to at least a portion of the target polynucleotide and may be immobilized on the support. In an exemplary embodiment, the anchor oligonucleotide has a sequence complementary to the 5′ end of the target polynucleotide and may be immobilized on the support.
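
By way of non-limiting illustration only, one way to lay out overlapping construction oligonucleotides for both strands is to tile the sense strand into contiguous segments and to tile the antisense strand with a half-length offset, so that each antisense oligonucleotide bridges the junction between two sense oligonucleotides. The following Python sketch shows that layout; the fixed oligonucleotide length, the half-length offset, the example sequence and the function names are assumptions chosen for illustration and are not the design rules of the disclosure.

```python
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def design_construction_oligos(target, length=8):
    """Split a target (sense strand, 5' to 3') into staggered oligonucleotides.

    Sense oligonucleotides tile the target without gaps. Antisense
    oligonucleotides cover windows shifted by half a length, so each one
    overlaps the junction between two adjacent sense oligonucleotides.
    All returned sequences are written 5' to 3'.
    """
    if length < 2:
        raise ValueError("length must be at least 2")
    half = length // 2
    sense = [target[i:i + length] for i in range(0, len(target), length)]
    antisense = [
        reverse_complement(target[i:i + length])
        for i in range(half, len(target) - half, length)
    ]
    return sense, antisense


# Hypothetical 24-base target, for illustration only.
target = "ATGGCTAGCTTAGGCATCGATCGA"
sense, antisense = design_construction_oligos(target, length=8)
print(sense)
print(antisense)
```

With this layout, the antisense oligonucleotides do not reach either end of the target, so the hybridized product carries single-stranded overhangs at both termini, consistent with the overhangs mentioned above.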

In some aspects of the invention, the reagents in the reaction volumes promote oligonucleotide or polynucleotide assembly by polymerase chain extension or ligase-based assembly. In some embodiments, the reaction volumes may contain two or more populations of single-stranded oligonucleotides having predefined sequences in solution. The populations of oligonucleotides can hybridize to a single-stranded oligonucleotide attached to the wetted surface, thereby forming double-stranded hybrids or duplexes attached to the surface. In some embodiments, the double-stranded hybrids contain breaks and gaps in the phosphodiester backbone, formed at the junctions of different oligonucleotide populations. In some embodiments, a polymerase and dNTPs and other necessary components are added to fill the gaps in the backbone. In other embodiments, a ligase and other necessary components are added to mend breaks in the backbone.

In some embodiments, two or more different oligonucleotides or polynucleotides may be immobilized or synthesized at the same location (or feature) on the solid support, thereby facilitating their interaction after amplification within the same droplet. See, e.g., US 2004/0101894. In some embodiments, droplets are merged to form bigger droplets by adding or spotting additional “merger” droplets or volumes in between or around the appropriate original droplets. Two droplets or isolated volumes can therefore merge if a “merger” droplet or volume is created and expanded until the merge takes place. The resultant merged volume will encompass the first stage droplets or first isolated volumes. The volume and location of the resulting merged volume can vary. The merged volumes (e.g., second stage droplet) can occupy a footprint that is the combination of all volumes (e.g., first stage droplets and merger droplet). Alternatively, the merged volumes can occupy at least part of the footprint of one of the isolated volumes (e.g., first or second isolated volume).

Some aspects of the invention relate to the destination selection and routing of the isolated volumes and therefore to the control of the location or footprint of merged volumes. One would appreciate that as individual regions of the support are addressable, individual isolated volumes such as droplets may be controlled individually. In some embodiments, it is preferable to place isolated volumes onto adjacent regions or features to allow merging of the volumes. Yet, in other embodiments, isolated volumes are directed or routed to a pre-selected destination. In some cases, the merged volumes occupy the footprint of one of the isolated volumes and extend to one or more smaller contact angle regions (SCA). In some embodiments, the substrate of the support is substantially planar and droplets are routed using a two-dimensional path (e.g., x, y axis). Droplets may be moved to bring them to selected locations for further processing, to be merged with a second isolated volume into a second stage droplet at preselected locations and/or, during the transport, to remove some reactants from the droplet (referred to herein as a “wash-in-transport” process).

In some embodiments, step-wise hierarchical and/or sequential assembly can be used to assemble oligonucleotides and longer polynucleotides. In a preferred embodiment, the methods use hierarchical assembly of two or more oligonucleotides or two or more nucleic acid subassemblies at a time. Neighboring droplets can be manipulated (moved and/or merged, as described above) to merge following a hierarchical strategy, thereby improving assembly efficiency. In some embodiments, each droplet contains oligonucleotides with predefined and different nucleic acid sequences. In some embodiments, two droplets are moved following a predefined path to an oligonucleotide-free position. In a preferred embodiment, the assembly molecules (e.g., oligonucleotides) are pre-arranged on the support surface at pre-determined discrete features.
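
By way of non-limiting illustration only, a hierarchical strategy that merges two neighboring droplets at a time can be sketched as successive pairwise rounds. In the following Python sketch, each fragment is modeled as a string and each merge as simple concatenation of neighbors; this simplification, which ignores overlaps and reaction chemistry, and the function name `hierarchical_assembly` are assumptions made for illustration.

```python
def hierarchical_assembly(fragments):
    """Merge neighboring fragments two at a time until one product remains.

    Returns the list of stages; stages[0] is the input and each later stage
    holds the products of merging adjacent pairs from the previous stage.
    An unpaired fragment at the end is carried forward unchanged.
    """
    current = list(fragments)
    stages = [current]
    while len(current) > 1:
        current = ["".join(current[i:i + 2]) for i in range(0, len(current), 2)]
        stages.append(current)
    return stages


# Five hypothetical first stage droplets, labeled A to E.
for number, stage in enumerate(hierarchical_assembly(["A", "B", "C", "D", "E"])):
    print(f"stage {number}: {stage}")
```

Because each round roughly halves the number of droplets, the number of merging rounds grows with the logarithm of the number of starting fragments rather than linearly.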

One should appreciate that isolated volumes may be routed independently in a sequential or highly parallel fashion. Droplets may be routed using electrowetting-based techniques (see, for example, U.S. Pat. No. 6,911,132 and U.S. Patent Application 2006/0054503). The electrowetting principle is based on manipulating droplets on a surface comprising an array of electrodes and using voltage to change the interfacial tension. By applying an electric field (e.g., alternating or direct), the contact angle between the fluid and surfaces can be modified. For example, by applying a voltage, the wetting properties of a hydrophobic surface can become increasingly hydrophilic and therefore wettable. In some embodiments, the array of electrodes is not in direct contact with the fluid. In some embodiments, droplets are moved using a wettability gradient. It has been shown that droplets placed on wettability gradient surfaces typically move in the direction of increasing wettability (see Zielke and Szymczyk, Eur. Phys. J. Special Topics, 166, 155-158 (2009)). In other embodiments, droplets may be moved using a thermal gradient. When placed on a thermal gradient, droplets move from higher temperature locations towards lower temperature locations. Moving droplets using electrowetting, temperature gradients and wettability gradients depends on the liquid (e.g., aqueous, non-aqueous, solute concentration), the size of the droplets and/or the steepness of the gradient.

One skilled in the art will appreciate that most of the electrowetting merging and mixing strategies rely on the fact that droplets have identical volumes before merging. In some aspects of the invention, routing of the droplet and merging is controlled by using droplets of different sizes. In a preferred embodiment, the footprint of the merged volume is controlled by the size of the droplets before merging. In some embodiments, the method comprises moving the content of smaller volume droplets to the position of larger volume droplets.

One skilled in the art will appreciate that the principle described herein can be applied to move liquid volumes such as droplets on the support along a predetermined path and to determine the exact location of the merged volume. In some embodiments, the content of the smaller volumes may repeatedly be moved to the position of larger volumes in order to move liquid volumes over a distance that is larger than a merger region.
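
By way of non-limiting illustration only, the size-controlled merging described above can be modeled by letting the merged volume take the position of the larger droplet, and repeating the step to carry contents along a path. In the following Python sketch, the `Droplet` class, the nanoliter units, the `growth` factor and the function names are hypothetical; evaporation and other means of adjusting droplet size are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Droplet:
    position: tuple   # (x, y) location on the support, arbitrary units
    volume_nl: float  # volume in nanoliters (illustrative unit)


def merge(a, b):
    """Merge two droplets; the product sits where the larger droplet was."""
    if a.volume_nl == b.volume_nl:
        raise ValueError("equal volumes: destination is not set by size")
    larger = a if a.volume_nl > b.volume_nl else b
    return Droplet(larger.position, a.volume_nl + b.volume_nl)


def route(droplet, waypoints, growth=1.1):
    """Carry a droplet's content along waypoints by repeated merging.

    At each waypoint a slightly larger helper droplet is assumed to be
    deposited, so the merged volume ends up at that waypoint.
    """
    for point in waypoints:
        helper = Droplet(point, droplet.volume_nl * growth)
        droplet = merge(droplet, helper)
    return droplet


start = Droplet((0, 0), 1.0)
final = route(start, [(1, 0), (2, 0), (2, 1)])
print(final.position)
```

The sketch rejects equal volumes because, under the size-based principle, the destination is determined only when one droplet is larger than the other.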

Reactions include, but are not limited to, incubation, enzymatic reactions, dilution, mixing, error reduction and/or assembly. Although the figures show a linear, one-dimensional path, it should be appreciated that the droplet can be moved anywhere on the support surface. In some embodiments, the droplets are moved in a two-dimensional direction. Any other operations derived from this protocol can be envisioned. For example, droplets can be deposited sequentially, simultaneously, or in a parallel fashion. Droplets may contain only water and may be used as dilution droplets. Other droplets may contain a solute. Droplet content may be mixed by passive diffusion or active mixing. In some embodiments, at least two droplets are moved independently following a similar path and are then moved towards a feature that is referred to as a reaction feature, wherein the droplets are merged. The first and second droplet paths across the substrate may follow the same direction or may follow opposite directions. For example, a first droplet may be moved toward a stationary second droplet or the first and the second droplet may be moved toward each other. Moreover, if two droplets have the same size, reduction of the size of one droplet will enable it to move in the direction and to the location of the larger volume. Reduction of the size of the droplet can be achieved by evaporation. Evaporation of liquid may be achieved using any technique known in the art. For example, the isolated liquid volume to be decreased may be heated to induce or accelerate evaporation. Alternatively, to merge a first droplet at the location of a second droplet, liquid may be added to the second droplet to increase its size relative to the first droplet.

Another benefit of the droplet movement process described herein is the implementation of a “wash” operation (referred to herein as wash-in-transportation). The movement of the liquid away from a surface feature allows the separation of the surface-bound molecules (e.g., oligonucleotides) from the molecules in solution. Hence, a wash operation is implemented. For example, wash-in-transportation can be used to remove the template oligonucleotides from the complementary oligonucleotides after amplification. In some embodiments, “wash-in-transportation” features or wash spots may be placed adjacent to features where oligonucleotide processing takes place.

In some embodiments, the “merger” droplets or the “anchor” droplet may or may not contain enzyme (e.g., polymerase, ligase, etc.), additional oligonucleotides and all reagents to allow assembly by PCR or by ligation (enzymatic or chemical) or by any combination of enzymatic reactions. For example, oligonucleotides in a given droplet may hybridize to each other and may assemble by PCR or ligation. The bigger droplets or second stage droplets contain polynucleotide subassemblies and can be subsequently merged to form larger droplets or third stage droplets containing larger fragments. As used herein, the term subassembly refers to a nucleic acid molecule that has been assembled from a set of oligonucleotides. Preferably, a subassembly is at least 2-fold longer than the oligonucleotides. For example, a subassembly may be about 100, 200, 300, 400, 500, 600, or more bases long. One should appreciate that the use of droplets as isolated reaction volumes enables a highly parallel system. In some embodiments, at least 100 or at least 1,000 reactions can take place in parallel. In some embodiments, the primers are immobilized on the support in close proximity to the spots containing the oligonucleotides to be assembled. In some embodiments, the primers are cleaved in situ. In some embodiments, the primers are supported on the solid support. The primers may then be cleaved in situ and eluted within a droplet that will subsequently be merged with a droplet containing solid-supported or eluted oligonucleotides.

Some aspects of the invention relate to the transport of charged molecules such as nucleic acids (e.g., oligonucleotides or polynucleotides) to a selected destination or selected feature on a support within a fluid medium using a planar two-dimensional path (x, y axis). Preferably, the molecules are electrophoretically transported by polarization of the molecules of interest on application of a voltage, the charged molecule moving towards an electrode (anode or cathode). In some embodiments, the array comprises one or more, preferably a plurality of, electrophoretic planar microfluidic units, each microfluidic unit comprising two electrodes. The electrode system comprises at least one cathode and one anode. In some configurations, the cathode and anode are shared by a plurality of microfluidic units. In other configurations, the cathode and anode are for a single microfluidic unit. The microfluidic units enable the displacement of charged molecules of interest according to an electrophoretic path. In some embodiments, each microfluidic unit comprises at least one channel. Preferably, each microfluidic unit is fluidly connected. For example, each microfluidic unit may be connected to another microfluidic unit by a channel. In preferred embodiments, an aqueous buffer is utilized as the fluid in the device. In some embodiments, each microfluidic unit may comprise a capture site. In some embodiments, the capture site corresponds to an array feature. Yet in other embodiments, the capture site corresponds to an array interfeature. In some embodiments, the capture site comprises a material that captures charged molecules. In nucleic acids, the phosphate ion carries a negative charge. Accordingly, preferably the capture site comprises a material that captures negatively charged molecules. 
In some instances, the capture material may capture the charged molecules of interest by chemical interaction through covalent bonding, hydrogen bonding, ionic bonding, van der Waals interactions, or other molecular interactions. Alternatively, the capture material does not interact with the molecules of interest but retards the molecule's electrophoretic transport. In some embodiments, at least a first feature and a second feature of the arrays are in fluid communication and the charged oligonucleotide or polynucleotide is moved between the first feature and the second feature by applying a voltage between the first and the second feature.

In certain embodiments, the oligonucleotides are designed to provide the full sense (plus strand) and antisense (minus strand) strands of the polynucleotide construct. After hybridization of the plus and minus strand oligonucleotides, double-stranded oligonucleotides are subjected to ligation in order to form a first subassembly product. Subassembly products are then subjected to ligation to form a larger nucleic acid or the full nucleic acid sequence.

Ligase-based assembly techniques may involve one or more suitable ligase enzymes that can catalyze the covalent linking of adjacent 3′ and 5′ nucleic acid termini (e.g., a 5′ phosphate and a 3′ hydroxyl of nucleic acid(s) annealed on a complementary template nucleic acid such that the 3′ terminus is immediately adjacent to the 5′ terminus). Accordingly, a ligase may catalyze a ligation reaction between the 5′ phosphate of a first nucleic acid and the 3′ hydroxyl of a second nucleic acid if the first and second nucleic acids are annealed next to each other on a template nucleic acid. A ligase may be obtained from recombinant or natural sources. A ligase may be a heat-stable ligase. In some embodiments, a thermostable ligase from a thermophilic organism may be used. Examples of thermostable DNA ligases include, but are not limited to: Tth DNA ligase (from Thermus thermophilus, available from, for example, Eurogentec and GeneCraft); Pfu DNA ligase (a hyperthermophilic ligase from Pyrococcus furiosus); Taq ligase (from Thermus aquaticus); 9° N Ligase; Ampligase®; any other suitable heat-stable ligase; or any combination thereof. In some embodiments, one or more lower temperature ligases may be used (e.g., T4 DNA ligase). A lower temperature ligase may be useful for shorter overhangs (e.g., about 3, about 4, about 5, or about 6 base overhangs) that may not be stable at higher temperatures.
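
By way of non-limiting illustration only, the adjacency condition for template-directed ligation can be written as a simple check: the 3′ hydroxyl of one annealed nucleic acid must sit immediately next to the 5′ phosphate of the next one on the template. In the following Python sketch, the `AnnealedOligo` class, its integer coordinate convention (positions numbered in the 5′ to 3′ direction of the annealed strands) and the function name `can_ligate` are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class AnnealedOligo:
    start: int  # position paired with the oligonucleotide's 5' terminal base
    end: int    # position paired with its 3' terminal base (inclusive)
    five_prime_phosphate: bool
    three_prime_hydroxyl: bool = True


def can_ligate(upstream, downstream):
    """True if the upstream 3' hydroxyl abuts the downstream 5' phosphate.

    A gap of one or more unpaired template positions, or a missing
    5' phosphate or 3' hydroxyl, prevents ligation in this model.
    """
    adjacent = downstream.start == upstream.end + 1
    return (
        adjacent
        and upstream.three_prime_hydroxyl
        and downstream.five_prime_phosphate
    )


first = AnnealedOligo(start=0, end=19, five_prime_phosphate=False)
second = AnnealedOligo(start=20, end=39, five_prime_phosphate=True)
gapped = AnnealedOligo(start=22, end=41, five_prime_phosphate=True)
print(can_ligate(first, second))  # nick only: ligatable in this model
print(can_ligate(first, gapped))  # two-base gap: would need fill-in first
```

In this model, a gap between annealed nucleic acids would first need to be filled, for example by a polymerase, before the remaining break could be sealed by a ligase.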

Non-enzymatic techniques can be used to ligate nucleic acids. For example, a 5′-end (e.g., the 5′ phosphate group) and a 3′-end (e.g., the 3′ hydroxyl) of one or more nucleic acids may be covalently linked together without using enzymes (e.g., without using a ligase). In some embodiments, non-enzymatic techniques may offer certain advantages over enzyme-based ligations. For example, non-enzymatic techniques may have a high tolerance of non-natural nucleotide analogues in nucleic acid substrates, may be used to ligate short nucleic acid substrates, may be used to ligate RNA substrates, and/or may be cheaper and/or more suited to certain automated (e.g., high throughput) applications.

Non-enzymatic ligation may involve a chemical ligation. In some embodiments, nucleic acid termini of two or more different nucleic acids may be chemically ligated. In some embodiments, nucleic acid termini of a single nucleic acid may be chemically ligated (e.g., to circularize the nucleic acid). It should be appreciated that both strands at a first double-stranded nucleic acid terminus may be chemically ligated to both strands at a second double-stranded nucleic acid terminus. However, in some embodiments only one strand of a first nucleic acid terminus may be chemically ligated to a single strand of a second nucleic acid terminus. For example, the 5′ end of one strand of a first nucleic acid terminus may be ligated to the 3′ end of one strand of a second nucleic acid terminus without the ends of the complementary strands being chemically ligated.

Accordingly, a chemical ligation may be used to form a covalent linkage between a 5′ terminus of a first nucleic acid end and a 3′ terminus of a second nucleic acid end, wherein the first and second nucleic acid ends may be ends of a single nucleic acid or ends of separate nucleic acids. In one aspect, chemical ligation may involve at least one nucleic acid substrate having a modified end (e.g., a modified 5′ and/or 3′ terminus) including one or more chemically reactive moieties that facilitate or promote linkage formation. In some embodiments, chemical ligation occurs when one or more nucleic acid termini are brought together in close proximity (e.g., when the termini are brought together due to annealing between complementary nucleic acid sequences). Accordingly, annealing between complementary 3′ or 5′ overhangs (e.g., overhangs generated by restriction enzyme cleavage of a double-stranded nucleic acid) or between any combination of complementary nucleic acids that results in a 3′ terminus being brought into close proximity with a 5′ terminus (e.g., the 3′ and 5′ termini are adjacent to each other when the nucleic acids are annealed to a complementary template nucleic acid) may promote a template-directed chemical ligation. Examples of chemical reactions may include, but are not limited to, condensation, reduction, and/or photo-chemical ligation reactions. It should be appreciated that in some embodiments chemical ligation can be used to produce naturally occurring phosphodiester internucleotide linkages, non-naturally-occurring phosphamide pyrophosphate internucleotide linkages, and/or other non-naturally-occurring internucleotide linkages.

In some embodiments, the process of chemical ligation may involve one or more coupling agents to catalyze the ligation reaction. A coupling agent may promote a ligation reaction between reactive groups in adjacent nucleic acids (e.g., between a 5′-reactive moiety and a 3′-reactive moiety at adjacent sites along a complementary template). In some embodiments, a coupling agent may be a reducing reagent (e.g., ferricyanide), a condensing reagent (e.g., cyanoimidazole, cyanogen bromide, carbodiimide, etc.), or irradiation (e.g., UV irradiation for photo-ligation).

In some embodiments, a chemical ligation may be an autoligation reaction that does not involve a separate coupling agent. In autoligation, the presence of a reactive group on one or more nucleic acids may be sufficient to catalyze a chemical ligation between nucleic acid termini without the addition of a coupling agent (see, for example, Xu et al., (1997) Tetrahedron Lett. 38:5595-8). Non-limiting examples of these reagent-free ligation reactions may involve nucleophilic displacements of sulfur on bromoacetyl, tosyl, or iodo-nucleoside groups (see, for example, Xu et al., (2001) Nat. Biotech. 19:148-52). Nucleic acids containing reactive groups suitable for autoligation can be prepared directly on automated synthesizers (see, for example, Xu et al., (1999) Nucl. Acids Res. 27:875-81). In some embodiments, a phosphorothioate at a 3′ terminus may react with a leaving group (such as tosylate or iodide) on a thymidine at an adjacent 5′ terminus. In some embodiments, two nucleic acid strands bound at adjacent sites on a complementary target strand may undergo auto-ligation by displacement of a 5′-end iodide moiety (or tosylate) with a 3′-end sulfur moiety. Accordingly, in some embodiments the product of an autoligation may include a non-naturally-occurring internucleotide linkage (e.g., a single oxygen atom may be replaced with a sulfur atom in the ligated product).

In some embodiments, a synthetic nucleic acid duplex can be assembled via chemical ligation in a one-step reaction involving simultaneous chemical ligation of nucleic acids on both strands of the duplex. For example, a mixture of 5′-phosphorylated oligonucleotides corresponding to both strands of a target nucleic acid may be chemically ligated by a) exposure to heat (e.g., to 97° C.) and slow cooling to form a complex of annealed oligonucleotides, and b) exposure to cyanogen bromide or any other suitable coupling agent under conditions sufficient to chemically ligate adjacent 3′ and 5′ ends in the nucleic acid complex.

In some embodiments, a synthetic nucleic acid duplex can be assembled via chemical ligation in a two-step reaction involving separate chemical ligations for the complementary strands of the duplex. For example, each strand of a target nucleic acid may be ligated in a separate reaction containing phosphorylated oligonucleotides corresponding to the strand that is to be ligated and non-phosphorylated oligonucleotides corresponding to the complementary strand. The non-phosphorylated oligonucleotides may serve as a template for the phosphorylated oligonucleotides during a chemical ligation (e.g., using cyanogen bromide). The resulting single-stranded ligated nucleic acid may be purified and annealed to a complementary ligated single-stranded nucleic acid to form the target duplex nucleic acid (see, for example, Shabarova et al., (1991) Nucl. Acids Res. 19:4247-51).

In one aspect, a nucleic acid fragment may be assembled in a polymerase-mediated assembly reaction from a plurality of oligonucleotides that are combined and extended in one or more rounds of polymerase-mediated extensions. In some embodiments, the oligonucleotides are overlapping oligonucleotides covering the full sequence but leaving single-stranded gaps that may be filled in by chain extension. The plurality of different oligonucleotides may provide either positive sequences (plus strand), negative sequences (minus strand), or a combination of both positive and negative sequences corresponding to the entire sequence of the nucleic acid fragment to be assembled. In some embodiments, one or more different oligonucleotides may have overlapping sequence regions (e.g., overlapping 5′ regions or overlapping 3′ regions). Overlapping sequence regions may be identical (i.e., corresponding to the same strand of the nucleic acid fragment) or complementary (i.e., corresponding to complementary strands of the nucleic acid fragment). The plurality of oligonucleotides may include one or more oligonucleotide pairs with overlapping identical sequence regions, one or more oligonucleotide pairs with overlapping complementary sequence regions, or a combination thereof. Overlapping sequences may be of any suitable length. For example, overlapping sequences may encompass the entire length of one or more nucleic acids used in an assembly reaction. Overlapping sequences may be between about 5 and about 500 nucleotides long (e.g., between about 10 and 100, between about 10 and 75, between about 10 and 50, about 20, about 25, about 30, about 35, about 45, about 50, etc.). However, shorter, longer, or intermediate overlapping lengths may be used. It should be appreciated that overlaps between different input nucleic acids used in an assembly reaction may have different lengths.
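The overlapping design described above can be illustrated with a minimal sketch. It tiles a target sequence into oligonucleotides that alternate between the plus and minus strands, so that each neighboring pair shares a complementary overlap. The function names, the 40-nucleotide oligonucleotide length, and the 20-nucleotide overlap are illustrative assumptions and are not taken from the specification. Choosing an overlap shorter than half the oligonucleotide length leaves single-stranded gaps of the kind that chain extension would fill.

```python
# Hypothetical helper names; lengths below are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(target, oligo_len=40, overlap=20):
    """Split `target` into oligos that alternate between plus and minus strands.

    Each oligo starts `oligo_len - overlap` nucleotides after the previous one,
    so neighboring oligos share `overlap` complementary nucleotides.
    Returns a list of (strand, start, sequence) tuples, sequences written 5' to 3'.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    index = 0
    while True:
        end = min(start + oligo_len, len(target))
        window = target[start:end]
        if index % 2 == 0:
            oligos.append(("+", start, window))
        else:
            oligos.append(("-", start, reverse_complement(window)))
        if end == len(target):
            break
        start += step
        index += 1
    return oligos


if __name__ == "__main__":
    target = "ACGT" * 25  # placeholder 100-nt sequence, not a real gene
    for strand, start, seq in tile_oligos(target):
        print(strand, start, seq)
```

Varying `overlap` per call is one way to model the statement that overlaps between different input nucleic acids may have different lengths.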

Polymerase-based assembly techniques may involve one or more suitable polymerase enzymes that can catalyze a template-based extension of a nucleic acid in a 5′ to 3′ direction in the presence of suitable nucleotides and an annealed template. A polymerase may be thermostable. A polymerase may be obtained from recombinant or natural sources. In some embodiments, a thermostable polymerase from a thermophilic organism may be used. In some embodiments, a polymerase may include a 3′→5′ exonuclease/proofreading activity. In some embodiments, a polymerase may have no, or little, proofreading activity (e.g., a polymerase may be a recombinant variant of a natural polymerase that has been modified to reduce its proofreading activity). Examples of thermostable DNA polymerases include, but are not limited to: Taq (a heat-stable DNA polymerase from the bacterium Thermus aquaticus); Pfu (a thermophilic DNA polymerase with a 3′→5′ exonuclease/proofreading activity from Pyrococcus furiosus, available from, for example, Promega); VentR® DNA Polymerase and VentR® (exo-) DNA Polymerase (thermophilic DNA polymerases with or without a 3′→5′ exonuclease/proofreading activity from Thermococcus litoralis; also known as Tli polymerase); Deep VentR® DNA Polymerase and Deep VentR® (exo-) DNA Polymerase (thermophilic DNA polymerases with or without a 3′→5′ exonuclease/proofreading activity from Pyrococcus species GB-D; available from New England Biolabs); KOD HiFi (a recombinant Thermococcus kodakaraensis KOD1 DNA polymerase with a 3′→5′ exonuclease/proofreading activity, available from Novagen); BIO-X-ACT (a mix of polymerases that possesses 5′→3′ DNA polymerase activity and 3′→5′ proofreading activity); Klenow Fragment (an N-terminal truncation of E. coli DNA Polymerase I which retains polymerase activity, but has lost the 5′→3′ exonuclease activity, available from, for example, Promega and NEB); Sequenase™ (T7 DNA polymerase deficient in 3′→5′ exonuclease activity); Phi29 (bacteriophage Phi29 DNA polymerase, which may be used for rolling circle amplification, for example, in a TempliPhi™ DNA Sequencing Template Amplification Kit, available from Amersham Biosciences); TopoTaq (a hybrid polymerase that combines hyperstable DNA binding domains and the DNA unlinking activity of Methanopyrus topoisomerase, with no exonuclease activity, available from Fidelity Systems); TopoTaq HiFi, which incorporates a proofreading domain with exonuclease activity; Phusion™ (a Pyrococcus-like enzyme with a processivity-enhancing domain, available from New England Biolabs); any other suitable DNA polymerase, or any combination of two or more thereof.

In some embodiments, the polymerase can be an SDP (strand-displacing polymerase; e.g., an SDPe, which is an SDP with no exonuclease activity). This allows isothermal PCR (isothermal extension, isothermal amplification) at a uniform temperature. As the polymerase (for example, Phi29, Bst) travels along a template, it displaces the complementary strand (e.g., created in previous extension reactions). As the displaced DNAs are single-stranded, primers can bind at a consistent temperature, removing the need for any thermocycling during amplification, thereby avoiding or decreasing evaporation of the reaction mixture.

It should be appreciated that the description of the assembly reactions in the context of the oligonucleotides is not intended to be limiting. For example, other polynucleotides (e.g., single-stranded, double-stranded polynucleotides, restriction fragments, amplification products, naturally occurring polynucleotides, etc.) may be included in an assembly reaction, along with one or more oligonucleotides, in order to generate a polynucleotide of interest.

Cloning and In Vitro Expression

Some aspects of the invention provide methods, devices and compositions for designing a protein having one or more desired characteristics, such as a desired function or property. In some embodiments, proteins can be designed and/or screened in silico. A wide variety of computational methods can be used to generate and/or screen libraries of proteins to identify potential proteins or protein variants that can exhibit the desired characteristic. Once a number of proteins or protein variants have been identified, nucleic acids encoding the plurality of proteins or protein variants can be synthesized and the plurality of proteins or protein variants can be expressed and/or screened according to the methods disclosed herein to determine if the proteins have the desired characteristic.

In some embodiments, the invention provides arrays comprising libraries of proteins or variant proteins, with the library comprising at least about 100 different protein variants, with at least about 500 different protein variants being preferred, about 1000 different protein variants, about 10,000 different protein variants or more.

In some embodiments, a plurality of nucleic acids synthesized and/or assembled using the above methods and devices are expressed using an in vitro transcription and/or translation system. In preferred embodiments, the plurality of nucleic acids are generated on a support (e.g., an array) and the plurality of proteins are expressed on the same or a different support (e.g., array). The methods described above make possible the direct fabrication of nucleic acids of any desired sequence. In some embodiments, the plurality of nucleic acids encodes nucleic acid library members. In various embodiments of the invention, the nucleic acids synthesized and/or assembled using the above methods and devices can be introduced into an appropriate vector by way of cloning. For example, the resulting polynucleotides can be individually cloned into an expression vector. The nucleic acid sequence may be inserted into the vector by a variety of procedures. In general, a nucleic acid is inserted into an appropriate restriction endonuclease site(s) using techniques known in the art. Vector components generally include, but are not limited to, one or more of a signal sequence, an origin of replication, one or more marker genes, an enhancer element, a promoter, and a transcription termination sequence. Construction of suitable vectors containing one or more of these components employs standard ligation techniques which are known to the skilled artisan. Such techniques are well known in the art and well described in the scientific and patent literature.

Various vectors are publicly available. The vector may, for example, be in the form of a plasmid, cosmid, viral particle, or phage. Both expression and cloning vectors contain a nucleic acid sequence that enables the vector to replicate in one or more selected host cells. Such vector sequences are well known for a variety of bacteria, yeast, and viruses. Useful expression vectors that can be used include, for example, segments of chromosomal, non-chromosomal and synthetic DNA sequences. Suitable vectors include, but are not limited to, derivatives of SV40 and pcDNA and known bacterial plasmids such as col E1, pCR1, pBR322, pMal-C2, pET, pGEX as described by Smith, et al., Gene 57:31-40 (1988), pMB9 and derivatives thereof, plasmids such as RP4, phage DNAs such as the numerous derivatives of phage λ such as NM989, as well as other phage DNA such as M13 and filamentous single-stranded phage DNA; yeast plasmids such as the 2 micron plasmid or derivatives of the 2μ plasmid, as well as centromeric and integrative yeast shuttle vectors; vectors useful in eukaryotic cells such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or the expression control sequences; and the like. The requirements are that the vectors are replicable and viable in the host cell of choice. Low- or high-copy number vectors may be used as desired.

In some embodiments, the synthetic sequences are cloned into cloning vectors. For example, the polynucleotide constructs may be introduced into an expression vector and transformed or transfected into a host cell. Any suitable vector may be used. Appropriate cloning vectors include, but are not limited to, plasmids, phages, cosmids, bacterial vectors, bacterial artificial chromosomes (BACs), P1 derived artificial chromosomes (PACs), YAC, P1 vectors and the like. Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter “Maniatis”); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987). In some embodiments, a vector may be a vector that replicates in only one type of organism (e.g., bacterial, yeast, insect, mammalian, etc.) or in only one species of organism. Some vectors may have a broad host range. Some vectors may have different functional sequences (e.g., origins of replication, selectable markers, etc.) that are functional in different organisms. These may be used to shuttle the vector (and any nucleic acid fragment(s) that are cloned into the vector) between two different types of organism (e.g., between bacteria and mammals, yeast and mammals, etc.). In some embodiments, the type of vector that is used may be determined by the type of host cell that is chosen. Preferably, a bacterium is used as a host cell and BAC vectors are utilized because of their capability to contain long nucleic acid sequence inserts, typically 50 to 350 kb (see Zhao et al., editors, Bacterial Artificial Chromosomes, Humana Press, Totowa, N.J., 2004, which is incorporated herein by reference).

In some embodiments, a library of promoter sequences is provided. In some embodiments, the library of promoters comprises a plurality of different promoters. Different promoters' sequences may be related or unrelated. In an exemplary embodiment, the promoter sequences may be obtained from a bacterial source. Each promoter sequence may be native or foreign to the polynucleotide sequence to which it is operably linked. Each promoter sequence may be any nucleic acid sequence which shows transcriptional activity in the host cell. A variety of promoters can be utilized. For example, the different promoter sequences may have different promoter strengths. In some embodiments, the library of promoter sequences comprises promoter variant sequences. In a preferred embodiment, the promoter variants cover a wide range of promoter activities from the weak promoter to the strong promoter. A promoter used to obtain a library of promoters may be determined by sequencing a particular host cell genome. Putative promoter sequences may then be identified using computerized algorithms such as the Neural Network of Promoter Prediction software (Demeler et al., Nucl. Acids Res. 1991, 19:1593-1599). Putative promoters may also be identified by examination of a family of genomes and homology analysis. The library of promoters may be placed upstream of a single gene or operon or upstream of a library of genes.
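As a rough illustration of placing a promoter library upstream of a library of genes, the sketch below enumerates every promoter and gene pairing. The function name, the construct naming scheme, and all sequences are hypothetical placeholders chosen for the example; they are not real promoter or gene sequences and do not come from the specification.

```python
from itertools import product


def build_constructs(promoters, genes):
    """Pair every promoter with every gene, promoter placed upstream.

    `promoters` and `genes` map a name to a DNA sequence string.
    Returns a list of dicts with a combined name and concatenated sequence.
    """
    constructs = []
    for (p_name, p_seq), (g_name, g_seq) in product(promoters.items(), genes.items()):
        constructs.append({"name": f"{p_name}-{g_name}", "sequence": p_seq + g_seq})
    return constructs


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    promoters = {"weak": "AAAA", "medium": "CCCC", "strong": "GGGG"}
    genes = {"geneA": "ATGAAATAA", "geneB": "ATGCCCTAA"}
    for construct in build_constructs(promoters, genes):
        print(construct["name"], construct["sequence"])
```

With a single entry in `genes`, the same call models the case where the promoter library is placed upstream of one gene or operon.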

A host cell may be transformed with the resulting nucleic acid constructs using any suitable technique (e.g., electroporation, chemical transformation, infection with a viral vector, etc.). Certain host organisms are more readily transformed than others. In some embodiments, all of the nucleic acid fragments and a linearized vector are mixed together and transformed into the host cell in a single step. However, in some embodiments, several transformations may be used to introduce all the fragments and vector into the cell (e.g., several successive transformations using subsets of the fragments). It should be appreciated that the linearized vector is preferably designed to have incompatible ends so that it can only be circularized (and thereby confer resistance to a selectable marker) if the appropriate fragments are cloned into the vector in the designed configuration. This avoids or reduces the occurrence of “empty” vectors after selection. The nucleic acids may be introduced into the host cell by any means known in the art, including, but not limited to, transformation, transfection, electroporation, microinjection, etc. In particular non-limiting embodiments of the invention, one or more nucleic acids may be introduced into a parental host cell, which is then propagated to produce a population of progeny host cells containing the nucleic acids. A mini-prep can be performed therefrom to purify the nucleic acids for further testing (e.g., sequencing). Clones having the correct or desired nucleic acids can be subcloned for in vitro protein synthesis or in vitro transcription/translation.

The nucleic acid constructs can be constructed to include appropriate promoter and translation sequences for in vitro protein synthesis or in vitro transcription/translation. Any suitable promoter can be used, such as the ara B, tac, T7, T3 or SP6 promoters, amongst others. The promoter is placed so that it is operably linked to the DNA sequences of the invention such that such sequences are expressed.

An In Vitro Protein Synthesis (IVPS) system, in general, includes cell extracts that support the synthesis of proteins in vitro from purified mRNA transcripts or from mRNA transcribed from DNA during the in vitro synthesis reaction. Such protein synthesis systems generally include a nucleic acid template that encodes a protein of interest. The nucleic acid template is an RNA molecule (e.g., mRNA) or a nucleic acid that encodes an mRNA (e.g., RNA, DNA) and can be in any form (e.g., linear, circular, supercoiled, single stranded, double stranded, etc.). Nucleic acid templates guide production of the desired protein. IVPS systems can also be engineered to guide the incorporation of detectably labeled amino acids, or unconventional or unnatural amino acids, into a desired protein.

In a generic IVPS reaction, a gene encoding a protein of interest is expressed in a transcription buffer (e.g., having appropriate salts, detergents and pH), resulting in mRNA that is translated into the protein of interest in an IVPS extract and a translation buffer (e.g., having appropriate salts, detergents and pH). The transcription buffer, IVPS extract and translation buffer can be added separately, or two or more of these solutions can be combined before their addition, or added contemporaneously. To synthesize a protein of interest in vitro, an IVPS extract generally at some point comprises an mRNA molecule that encodes the protein of interest. In early IVPS experiments, mRNA was added exogenously after being purified from natural sources or prepared synthetically in vitro from cloned DNA using bacteriophage RNA polymerases. In other systems, the mRNA is produced in vitro from a template DNA; both transcription and translation occur in this type of IVPS reaction. Techniques using coupled or complementary transcription and translation systems, which carry out the synthesis of both RNA and protein in the same reaction, have been developed. In such in vitro transcription and translation (IVTT) systems, the IVPS extracts contain all the components necessary both for transcription (to produce mRNA) and for translation (to synthesize protein) in a single system. An early IVTT system was based on a bacterial extract (Lederman and Zubay, Biochim. Biophys. Acta, 149: 253, 1967). In IVTT systems, the input nucleic acid is DNA, which is normally much easier to obtain than mRNA, and more readily manipulated (e.g., by cloning, site-specific recombination, and the like).

An IVTT reaction mixture typically comprises the following components: a template nucleic acid, such as DNA, that comprises a gene of interest (GOI) operably linked to at least one promoter and, optionally, one or more other regulatory sequences (e.g., a cloning or expression vector containing the GOI); an RNA polymerase that recognizes the promoter(s) to which the GOI is operably linked and, optionally, one or more transcription factors directed to an optional regulatory sequence to which the template nucleic acid is operably linked; ribonucleotide triphosphates (rNTPs); ribosomes; transfer RNA (tRNA); optionally, other transcription factors and co-factors therefor; amino acids (optionally comprising one or more detectably labeled amino acids); one or more energy sources (e.g., ATP, GTP); and other or optional translation factors (e.g., translation initiation, elongation and termination factors) and co-factors therefor.
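The component list above lends itself to a simple completeness check before a reaction is set up. The following is a minimal sketch, assuming the non-optional items in the list are treated as required; the component labels, the `REQUIRED_COMPONENTS` set, and the function name are hypothetical conventions chosen for the example.

```python
# Non-optional IVTT components, labeled with hypothetical short names.
REQUIRED_COMPONENTS = frozenset({
    "template DNA",     # gene of interest operably linked to a promoter
    "RNA polymerase",   # recognizes the promoter on the template
    "rNTPs",            # ribonucleotide triphosphates
    "ribosomes",
    "tRNA",
    "amino acids",
    "energy source",    # e.g., ATP, GTP
})


def missing_components(mixture):
    """Return the required components absent from `mixture`, sorted by name."""
    return sorted(REQUIRED_COMPONENTS - set(mixture))


if __name__ == "__main__":
    planned = ["template DNA", "rNTPs", "ribosomes", "tRNA", "amino acids"]
    print(missing_components(planned))
```

Optional items, such as additional transcription or translation factors, are deliberately left out of the required set in this sketch.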

In some aspects, the invention relates to high throughput expression of proteins using in vitro transcription/translation. In preferred embodiments, the methods and devices use minimized sample volumes, such as microvolumes, nanovolumes, picovolumes or sub-picovolumes. Accordingly, aspects of the invention relate to methods and devices for amplification and/or assembly of polynucleotide sequences and for expression of proteins in small volume droplets on separate and addressable features of a support. In some embodiments, predefined reaction microvolumes of between about 0.5 pL and about 100 nL may be used. However, smaller or larger volumes may be used. One would appreciate that the minimized sample volume increases the number of samples that can be processed in an efficient and parallel manner. Methods and devices of the present invention provide for minimizing the volume of a reaction while controlling the loss of liquid due to evaporation as discussed herein. In some embodiments, the transcription/translation reactions can be generated or performed within a vessel of minimized proportions. Minimizing the vessel size in relation to reaction volume can reduce the effects of evaporation. For example, the transcription/translation reaction can take place on a solid surface or support, such as an array. In some embodiments, the transcription/translation reactions can be performed on the same support as the support for nucleic acid assembly. For example, the transcription/translation reactions can be performed at a different feature or area than the assembly reactions. In yet another embodiment, the transcription/translation reactions are performed on a different support.
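The effect of reaction volume on throughput can be made concrete with a short calculation. The sketch below counts how many reactions a fixed reagent budget supports at each end of the about 0.5 pL to about 100 nL range. The 1 µL reagent budget and the function name are assumptions made for the example, and the calculation ignores dead volume and evaporation losses. Volumes are held as whole femtoliters to avoid floating-point rounding.

```python
# Unit constants expressed in femtoliters (fL).
FL_PER_PL = 1_000
FL_PER_NL = 1_000_000
FL_PER_UL = 1_000_000_000


def max_reactions(reagent_fl, reaction_fl):
    """Number of whole reactions of `reaction_fl` that fit in `reagent_fl`."""
    if reaction_fl <= 0:
        raise ValueError("reaction volume must be positive")
    return reagent_fl // reaction_fl


if __name__ == "__main__":
    reagent = 1 * FL_PER_UL         # assumed 1 µL reagent budget
    smallest = FL_PER_PL // 2       # 0.5 pL expressed as 500 fL
    largest = 100 * FL_PER_NL       # 100 nL
    print(max_reactions(reagent, largest))   # reactions at 100 nL each
    print(max_reactions(reagent, smallest))  # reactions at 0.5 pL each
```

Under these assumptions, the same reagent budget supports 10 reactions at 100 nL and 2,000,000 reactions at 0.5 pL.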

In some embodiments, a library of synthetic nucleic acid constructs integrated into a plasmid or in linear form can be transferred to individual wells of a micro-well plate or to specific locations on a support containing the appropriate transcription/translation reaction reagents. In some embodiments, a droplet-based dispensing apparatus can be used to dispense droplets of transcription/translation reagents directly onto specific locations or distinct features of the solid surface, covering the deposited nucleic acid constructs and forming a self-contained reaction volume. For example, droplets can be dispensed on one, two or all features having nucleic acids. The support with the reaction volumes can be inserted into a humidity-controlled chamber as described herein to counteract evaporation. The chamber can control the humidity therein and minimize evaporation of the reaction volumes. Other humidity control mechanisms can also be used, such as “sacrificial” droplets placed around or in close proximity to the droplet of interest, a sealing layer or lid, etc.

In some embodiments, the reaction reagents can be tailored to the specific type of protein being expressed. For example, prokaryotic proteins can be expressed using bacterial reagents, such as an E. coli lysate-based expression system, e.g., the PURExpress® system from New England Biolabs. Eukaryotic proteins can also be expressed with compatible systems, including wheat germ (e.g., T7 coupled reticulocyte lysate TNT™ system (Promega, Madison, Wis.)) and erythrocyte lysate-based expression systems. Incubation of the reaction under appropriate reaction conditions, such as 37° C., can be followed by an analysis of protein expression. In some embodiments, synthesized protein products can be separated using gel electrophoresis and visualized or imaged by staining or Western blot.

In some embodiments, the presence of proteins of interest can be assessed by measuring protein activity. A variety of protein activities can be assayed. Non-limiting examples include binding activity (e.g., specificity, affinity, saturation, competition), enzyme activity (kinetics, substrate specificity, product, inhibition), etc. Exemplary methods include, but are not limited to, spectrophotometric, colorimetric, fluorometric, calorimetric, chemiluminescent, light scattering, radiometric, and chromatographic methods.

Automation

Aspects of the methods and devices provided herein may include automating one or more acts described herein. In some embodiments, one or more steps of an amplification and/or assembly reaction may be automated using one or more automated sample handling devices (e.g., one or more automated liquid or fluid handling devices). Automated devices and procedures may be used to deliver reaction reagents, including one or more of the following: starting nucleic acids, buffers, enzymes (e.g., one or more ligases and/or polymerases), nucleotides, salts, and any other suitable agents such as stabilizing agents. Automated devices and procedures also may be used to control the reaction conditions. For example, an automated thermal cycler may be used to control reaction temperatures and any temperature cycles that may be used. In some embodiments, a scanning laser may be automated to provide one or more reaction temperatures or temperature cycles suitable for incubating polynucleotides. Subsequent analysis of assembled polynucleotide products may be automated. For example, sequencing may be automated using a sequencing device and automated sequencing protocols. Additional steps (e.g., amplification, cloning, etc.) also may be automated using one or more appropriate devices and related protocols. It should be appreciated that one or more of the devices or device components described herein may be combined in a system (e.g., a robotic system) or in a micro-environment (e.g., a micro-fluidic reaction chamber). Assembly reaction mixtures (e.g., liquid reaction samples) may be transferred from one component of the system to another using automated devices and procedures (e.g., robotic manipulation and/or transfer of samples and/or sample containers, including automated pipetting devices, micro-systems, etc.). The system and any components thereof may be controlled by a control system.

Accordingly, method steps and/or aspects of the devices provided herein may be automated using, for example, a computer system (e.g., a computer controlled system). A computer system on which aspects of the technology provided herein can be implemented may include a computer for any type of processing (e.g., sequence analysis and/or automated device control as described herein). However, it should be appreciated that certain processing steps may be provided by one or more of the automated devices that are part of the assembly system. In some embodiments, a computer system may include two or more computers. For example, one computer may be coupled, via a network, to a second computer. One computer may perform sequence analysis. The second computer may control one or more of the automated synthesis and assembly devices in the system. In other aspects, additional computers may be included in the network to control one or more of the analysis or processing acts. Each computer may include a memory and processor. The computers can take any form, as the aspects of the technology provided herein are not limited to being implemented on any particular computer platform. Similarly, the network can take any form, including a private network or a public network (e.g., the Internet). Display devices can be associated with one or more of the devices and computers. Alternatively, or in addition, a display device may be located at a remote site and connected for displaying the output of an analysis in accordance with the technology provided herein. Connections between the different components of the system may be via wire, optical fiber, wireless transmission, satellite transmission, any other suitable transmission, or any combination of two or more of the above.

Each of the different aspects, embodiments, or acts of the technology provided herein can be independently automated and implemented in any of numerous ways. For example, each aspect, embodiment, or act can be independently implemented using hardware, software or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. It should be appreciated that any component or collection of components that perform the functions described above can be generically considered as one or more controllers that control the above-discussed functions. The one or more controllers can be implemented in numerous ways, such as with dedicated hardware, or with general purpose hardware (e.g., one or more processors) that is programmed using microcode or software to perform the functions recited above.

In this respect, it should be appreciated that one implementation of the embodiments of the technology provided herein comprises at least one computer-readable medium (e.g., a computer memory, a floppy disk, a compact disk, a tape, etc.) encoded with a computer program (i.e., a plurality of instructions), which, when executed on a processor, performs one or more of the above-discussed functions of the technology provided herein. The computer-readable medium can be transportable such that the program stored thereon can be loaded onto any computer system resource to implement one or more functions of the technology provided herein. In addition, it should be appreciated that the reference to a computer program which, when executed, performs the above-discussed functions, is not limited to an application program running on a host computer. Rather, the term computer program is used herein in a generic sense to reference any type of computer code (e.g., software or microcode) that can be employed to program a processor to implement the above-discussed aspects of the technology provided herein.

It should be appreciated that in accordance with several embodiments of the technology provided herein wherein processes are stored in a computer readable medium, the computer implemented processes may, during the course of their execution, receive input manually (e.g., from a user).

Accordingly, overall system-level control of the assembly devices or components described herein may be performed by a system controller which may provide control signals to the associated nucleic acid synthesizers, liquid handling devices, thermal cyclers, sequencing devices, associated robotic components, as well as other suitable systems for performing the desired input/output or other control functions. Thus, the system controller along with any device controllers together form a controller that controls the operation of a nucleic acid assembly system. The controller may include a general purpose data processing system, which can be a general purpose computer, or network of general purpose computers, and other associated devices, including communications devices, modems, and/or other circuitry or components to perform the desired input/output or other functions. The controller can also be implemented, at least in part, as a single special purpose integrated circuit (e.g., ASIC) or an array of ASICs, each having a main or central processor section for overall, system-level control, and separate sections dedicated to performing various different specific computations, functions and other processes under the control of the central processor section. The controller can also be implemented using a plurality of separate dedicated programmable integrated or other electronic circuits or devices, e.g., hard wired electronic or logic circuits such as discrete element circuits or programmable logic devices. The controller can also include any other components or devices, such as user input/output devices (monitors, displays, printers, a keyboard, a user pointing device, touch screen, or other user interface, etc.), data storage devices, drive motors, linkages, valve controllers, robotic devices, vacuum and other pumps, pressure sensors, detectors, power supplies, pulse sources, communication devices or other electronic circuitry or components, and so on. The controller also may control operation of other portions of a system, such as automated client order processing, quality control, packaging, shipping, billing, etc., to perform other suitable functions known in the art but not described in detail herein.
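
By way of illustration only, the relationship between a system controller and individual device controllers can be sketched in a few lines of Python. The class names, the run and execute methods, the device names, and the command strings below are all hypothetical and do not describe any particular instrument interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DeviceController:
    """Hypothetical controller for a single device (e.g., a liquid handler)."""
    name: str
    log: List[str] = field(default_factory=list)

    def run(self, command: str) -> str:
        # A real device controller would drive hardware; this sketch only records.
        self.log.append(command)
        return f"{self.name}: {command} done"


@dataclass
class SystemController:
    """Hypothetical system-level controller that routes control signals."""
    devices: Dict[str, DeviceController] = field(default_factory=dict)

    def register(self, device: DeviceController) -> None:
        self.devices[device.name] = device

    def execute(self, steps: List[Tuple[str, str]]) -> List[str]:
        # Each step is (device name, command); an unknown device raises KeyError.
        return [self.devices[name].run(command) for name, command in steps]


controller = SystemController()
for device_name in ("synthesizer", "liquid_handler", "thermal_cycler"):
    controller.register(DeviceController(device_name))

results = controller.execute([
    ("synthesizer", "synthesize oligonucleotides"),
    ("liquid_handler", "dispense ligation mix"),
    ("thermal_cycler", "incubate"),
])
```

In this sketch the system controller holds no device logic of its own; it only forwards commands, which mirrors the description of a system controller and device controllers together forming the overall controller.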

Various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Use of ordinal terms such as “first,” “second,” “third,” etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

EXAMPLES

Some aspects of the present invention relate to performing protein expression in vitro. In some embodiments, after synthesis and assembly on a surface or solid support, the nucleic acids can be amplified and/or cloned. The resulting nucleic acid constructs with the appropriate regulatory sequences can be mixed with the reagents required for transcription and translation, for the on-surface production of the corresponding proteins encoded by the nucleic acids. These in vitro protein expression reactions can be performed, for example, at a micro-volume scale and in a massively parallel manner, thereby saving time and costs. These advantages make in vitro on-surface production of proteins an attractive platform for the high throughput production of protein libraries. Methods and devices of the present invention also provide an advantageous platform for high throughput enzyme assays of in vitro produced protein libraries and in vitro development of bio-processing pathways using combinatorial protein libraries.

The following examples illustrate some embodiments in accordance with the present invention, where methods and devices for synthetic nucleic acid synthesis and highly parallel in vitro protein library production on solid surfaces are provided.

Example 1: Building Genetic Constructs from Microarray Based Material

FIGS. 1A-D are schematic drawings of surface attached synthesis of nucleic acid constructs using microarray sourced oligonucleotide building blocks and ligation assembly.

Genes to be expressed in the platform can be produced using oligonucleotides generated from enzymatic manipulation of nucleic acid microarrays (FIG. 1A, 10). On the microarray are groups of oligonucleotides (20) containing the genetic information required to build a target nucleic acid construct. Specifically, a universal primer (30) that hybridizes to the single stranded DNA (40) that makes up the microarray is introduced along with a polymerase such as the Klenow fragment of DNA polymerase I from Escherichia coli (50) and other reaction components including salts, buffers and deoxynucleotides. The universal primer (30) can include sequences that can be recognized and cleaved by appropriate enzymes (70). These components are used to create a complementary second strand of DNA (60), essentially copying the information contained within the nucleic acid on the microarray onto a recoverable molecule. Using enzymes (70) that cleave the universal primer (30), the universal primer can be fragmented (80) and removed by washing the microarray surface.

The copied single-stranded construction oligonucleotides can then be released in solution, thereby forming a pool of construction oligonucleotides in solution. For example, the construction oligonucleotides can be eluted by heating the microarray in a buffer solution at high temperature, for instance 95° C., resulting in a plurality of populations of oligonucleotides (FIG. 1B, 65). Each population of oligonucleotides (1, 2, 3, 4, . . . , N+1, N+2, N+3, N+4, N+5) can have a predefined sequence, each complementary to the oligonucleotide sequence on the corresponding spot or feature on the solid surface. Together the plurality of populations of oligonucleotides (65) can constitute nucleic acids of interest. Preferably, each population of construction oligonucleotides has a sequence complementary to a next population of oligonucleotides. For example, oligonucleotide 2 has a first sequence complementary to a terminus sequence of oligonucleotide 1 and has a second distinct sequence complementary to a sequence of oligonucleotide 3. In certain embodiments, a plurality of nucleic acids of interest (e.g., 5, 10, 20, 30, 50, 100, 200, 500, 1000, 10⁴, 10⁵, 10⁶, 10⁷, etc.) can be produced from a single microarray. In some embodiments, a library of nucleic acid variants (e.g., 5, 10, 20, 30, 50, 100, 200, 500, 1000, 10⁴, 10⁵, 10⁶, 10⁷, etc.) can be produced from a single microarray.
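
The overlap relationship between successive construction oligonucleotides can be illustrated with a short Python sketch. The target sequence, the 20-nucleotide oligonucleotide length, the 10-nucleotide overlap, and the function names are assumptions chosen for illustration; they are not taken from the figures or from any actual design.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(target: str, length: int, step: int) -> List[str]:
    """Split a target into overlapping oligonucleotides on alternating strands."""
    oligos = []
    for i, start in enumerate(range(0, len(target) - length + 1, step)):
        fragment = target[start:start + length]
        # Even-numbered tiles copy the top strand; odd-numbered tiles the bottom.
        oligos.append(fragment if i % 2 == 0 else reverse_complement(fragment))
    return oligos


def can_hybridize(a: str, b: str, k: int) -> bool:
    """True if a terminal k-mer of a is complementary to a terminal k-mer of b."""
    rc_b = reverse_complement(b)
    return a[-k:] == rc_b[:k] or a[:k] == rc_b[-k:]


# Hypothetical 40-nt target, used only to exercise the functions.
TARGET = "ATGGCTAGCAAGGAGGTTCACCGTCTGATCGGATCCTAAG"
oligos = tile_oligos(TARGET, length=20, step=10)
neighbors_pair = all(
    can_hybridize(oligos[i], oligos[i + 1], 10) for i in range(len(oligos) - 1)
)
```

With these assumed parameters, oligonucleotide 2 pairs with the 3′ half of oligonucleotide 1 and with the 5′ half of oligonucleotide 3, while oligonucleotides 1 and 3 share no complementary terminus with each other.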

Eluted oligonucleotides (65) can be transferred to a second area of the same array or an area of a second array having anchor oligonucleotides immobilized on its surface (e.g., oligonucleotides A, FIG. 1C, 90). Anchor oligonucleotides can be complementary to and can hybridize with the terminus, such as the 5′ or the 3′ terminus, of the DNA construct to be assembled (e.g., free oligonucleotide 1). The anchor oligonucleotides (oligonucleotides A, 90) can act as points of nucleation for the construction of the DNA product on the surface of the microarray (FIG. 1D, 100). In some embodiments, oligonucleotide 1 has a sequence complementary to the 5′ end of the anchor oligonucleotide and a sequence complementary to the 3′ end of oligonucleotide 2. In some embodiments, the construction oligonucleotides have a 3′ terminus sequence complementary to a 5′ terminus sequence of a first oligonucleotide and a distinct 5′ terminus sequence complementary to the 3′ terminus sequence of a second oligonucleotide. For example, oligonucleotides A, 1, 2, 3, 4, . . . , N+1, N+2, N+3, N+4, N+5 can sequentially hybridize with one another where complementary sequences are present. The hybridization conditions (e.g., temperature, salt concentration, buffer pH, etc.) can be adjusted to allow stringent or relaxed hybridization.

Construction oligonucleotides (1, 2, 3, 4, . . . , N+1, N+2, N+3, N+4, N+5) of the target nucleic acid product can be incubated on the surface of the array, in microvolumes (e.g., an individual droplet on each spot, or a plurality of droplets combined), in the presence of ligase and appropriate buffer components. Ligation can be achieved using a variety of different commercially available ligase enzymes. Low temperature (37° C. or lower) ligation can be performed using T4 DNA Ligase, available from New England Biolabs. High temperature (greater than 37° C.) ligation can be performed using thermophilic or thermostable ligases, such as Taq ligase and 9°N Ligase, available from New England Biolabs, and Ampligase®, available from Epicentre® Biotechnologies, according to manufacturer's instructions. After incubation at the appropriate temperature, unincorporated oligonucleotides and/or unwanted reagents can be removed by one or more rounds of washing. In this way, oligonucleotides (65) can be assembled into a full length nucleic acid of interest. It should be appreciated that a plurality of nucleic acids of interest can be produced in a highly parallel manner on one or more arrays.
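
The temperature-based choice of ligase described above can be written as a simple lookup. The following Python sketch is illustrative only; the function name is hypothetical, and the 37° C. threshold and enzyme names are simply those recited in the preceding paragraph, not a substitute for the manufacturer's instructions.

```python
from typing import List


def suggest_ligases(temperature_c: float) -> List[str]:
    """Return the ligases named in the text for a given ligation temperature."""
    if temperature_c <= 37:
        # Low temperature ligation (37 degrees C or lower).
        return ["T4 DNA Ligase"]
    # High temperature ligation (greater than 37 degrees C).
    return ["Taq ligase", "9°N Ligase", "Ampligase"]


low_temperature_choice = suggest_ligases(25)
high_temperature_choice = suggest_ligases(60)
```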

In certain embodiments, the nucleic acid of interest can include, in addition to, e.g., the gene of interest that encodes a protein of interest, other genetic elements that regulate transcription and/or translation or facilitate selection. For example, genetic elements such as the promoter, ribosome binding site, transcriptional terminator, as well as selectable markers can be included in the synthetic microarray and incorporated into the final synthetic DNA construct.

Example 2: High Throughput Expression of High Content Gene Libraries

Assembled nucleic acids can then be amplified using gene specific primers and the polymerase chain reaction. Full length constructs can then be cloned into appropriate vectors for sequence confirmation. Cloning can efficiently be accomplished using commercially available kits, such as the TOPO® line of cloning kits from Invitrogen or the StrataClone™ line from Agilent Technologies. Resultant plasmids can be transformed into E. coli, purified and sequenced to identify constructs with the desired sequence. Constructs with the appropriate genetic elements for transcription/translation can also be directly sequenced without cloning and transformation.

Nucleic acid constructs with the correct sequence can be cloned and/or subcloned into a plasmid containing the appropriate genetic elements (e.g., promoter, ribosome binding site, transcriptional terminator) for the in vitro expression of the corresponding proteins. Alternatively, in embodiments where the genetic elements have been included in the final synthetic nucleic acid construct, any plasmid can be used for cloning the construct. Constructs with the appropriate genetic elements can also be used in a linear form and subjected to in vitro expression, without the need for cloning into a destination plasmid.

Various in vitro transcription/translation systems described herein can be used. For example, for expression of eukaryotic proteins, extract of wheat germ can be used, e.g., the PureExpress® system (New England Biolabs, Ipswich, Mass.) and the T7 coupled reticulocyte lysate TNT™ system (Promega, Madison, Wis.). The in vitro expression can be performed in various formats, e.g., in a multi-well plate or on a solid surface support (e.g., array).

FIGS. 2A-2C are schematic drawings of in vitro transcription and translation of synthetic nucleic acid (e.g., genes of interest) in micro-well plate or on solid surface support formats.

For the system described in FIG. 2A, a T7 promoter (100) and an E. coli ribosome binding site (110) can be included upstream of a gene of interest (120). Downstream of the gene of interest is a transcriptional terminator (130). Other suitable promoters can include SP6 or T3. Commercially available products can be used and optimized for expression of proteins. For instance, the pET line of bacterial expression plasmids from EMD Chemicals Group can be used as a destination plasmid for in vitro expression of proteins in a bacterial expression system using the T7 promoter.
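
The order of elements in such an expression construct (promoter, ribosome binding site, gene of interest, terminator) can be illustrated with a minimal Python sketch. The function name is hypothetical, and the ribosome binding site, gene, and terminator sequences below are short stand-ins chosen only to make the example run; they are not validated genetic parts. The promoter string is the commonly cited T7 promoter consensus.

```python
from typing import List, Tuple

Feature = Tuple[str, int, int]  # (element name, start index, end index)


def build_cassette(promoter: str, rbs: str, gene: str,
                   terminator: str) -> Tuple[str, List[Feature]]:
    """Concatenate the elements 5' to 3' and record where each one lies."""
    if not gene.startswith("ATG") or len(gene) % 3 != 0:
        raise ValueError("gene must start with ATG and contain whole codons")
    sequence = ""
    features: List[Feature] = []
    for name, part in (("promoter", promoter), ("rbs", rbs),
                       ("gene", gene), ("terminator", terminator)):
        start = len(sequence)
        sequence += part
        features.append((name, start, len(sequence)))
    return sequence, features


sequence, features = build_cassette(
    promoter="TAATACGACTCACTATAGGG",  # T7 promoter consensus
    rbs="AGGAGGAAAAAA",               # stand-in: Shine-Dalgarno motif plus spacer
    gene="ATGGCTAGCTAA",              # stand-in: Met-Ala-Ser-stop
    terminator="CCCCGCTTCGGCGGGG",    # stand-in hairpin, not a real terminator
)
```

Recording the start and end of each element makes it straightforward to confirm, before synthesis, that the gene of interest sits downstream of the promoter and ribosome binding site and upstream of the terminator.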

The present invention, in one embodiment, features high throughput expression of proteins using in vitro transcription/translation that can be facilitated by a minimization of sample volumes. The minimized sample volume increases the number of samples that can be processed in an efficient and parallel manner. Methods and devices of the present invention provide for minimizing the volume of a reaction while controlling the loss of liquid due to evaporation as discussed herein. In one embodiment, the transcription/translation reaction can be generated or performed within a vessel of minimized proportions. In another embodiment, the transcription/translation reaction can take place on a solid surface or support.

Minimizing the vessel size in relation to reaction volume can reduce the effects of evaporation. For example, a library of synthetic nucleic acid constructs integrated into a plasmid (140) or in linear form (150) can be transferred to individual wells of a micro-well plate containing the appropriate transcription/translation reaction reagents (160). The reaction reagents can be tailored to the specific type of protein being expressed. For example, prokaryotic proteins can be expressed using bacterial reagents, such as an E. coli lysate based expression system, e.g., the PureExpress® system from New England Biolabs. Eukaryotic proteins can also be expressed with compatible systems, including wheat germ (e.g., T7 coupled reticulocyte lysate TNT™ system (Promega, Madison, Wis.)) and erythrocyte lysate based expression systems. Incubation of the reaction at the appropriate temperature, such as 37° C., can be followed by an analysis of successful expression of the proteins encoded by the gene library (170). This analysis can take the form of direct inspection of the protein products using gel electrophoresis and visualization of separated proteins using staining and imaging. For example, proteins can be separated by gel electrophoresis and visualized by staining the gel (e.g., Coomassie Brilliant Blue R-250 or silver stain), allowing visualization of the separated proteins, or processed further (e.g., Western blot). Measurement of protein activity can also be used to confirm the presence of the expressed proteins.

Other methods and devices for performing high throughput protein expression can involve the transcription/translation reaction taking place on a solid surface (180). For instance, a library of linear nucleic acid constructs can be hybridized to anchor oligonucleotides present on the surface (180), allowing retention of the linear nucleic acid construct (190) on the solid surface. A droplet based dispensing apparatus (200) as described herein can be used to dispense droplets of transcription/translation reagents directly onto specific locations of the solid surface (180), covering the deposited nucleic acid constructs (190), forming a self contained reaction volume (220). The solid surface with the reaction mixtures can be inserted into a humidity controlled chamber (230) as described herein to counteract evaporation. The chamber can control a humidity therein and minimize evaporation of the reaction mixtures. Other humidity control mechanisms can also be used, such as “sacrificial” droplets placed around or in close proximity to the droplet of interest, a sealing layer or lid, etc. The expression of proteins (170) from the nucleic acid library can then be assayed through direct inspection by gel electrophoresis and staining, or measurement of protein activity.

A variety of protein activities can be assayed. Non-limiting examples include binding activity (e.g., specificity, affinity, saturation, competition), enzyme activity (kinetics, substrate specificity, product, inhibition), etc. Exemplary methods include, but are not limited to, spectrophotometric, colorimetric, fluorometric, calorimetric, chemiluminescent, light scattering, radiometric, and chromatographic methods.

Example 3: High Throughput Analysis of On-Surface Expressed Enzyme Activity

FIGS. 3A-3D are schematic drawings of high throughput enzyme assays using in vitro expression of synthetic nucleic acid constructs on a solid support surface and analysis by liquid chromatography linked with mass spectrometry.

High throughput enzyme assays can be used to measure the activity of proteins expressed in vitro on a solid surface, as shown in FIGS. 3A-3D. Synthetic nucleic acid constructs (300) with the appropriate promoter and/or ribosome binding site are anchored to the solid support (310), covered with a droplet of transcription/translation reagent mix (320), and incubated in a humidity controlled chamber to minimize evaporation (330), as described herein. After completion of the protein expression phase of the process (FIG. 3B), the solid surface is removed from the humidity chamber and protein activity can be assessed. In some embodiments, a droplet based dispensing apparatus (340) can dispense reagents appropriate for supporting enzyme activity at specific locations (or features) on the support or solid surface. For example, reagents can be added in a drop wise manner (350). The solid surface is then placed back in the humidity control chamber (FIG. 3C), which can act as a reaction chamber. Under suitable conditions (e.g., suitable temperature conditions) the enzyme (355) converts the substrate (360) to product (370). When the enzymatic reaction is complete (FIG. 3D), the solid surface can be removed from the humidity control chamber and the presence of product can be assayed. Sampling can be performed by an automated sampling device including a sample uptake tube (380), a pump (385) and the appropriate separation method, for example a chromatography column (390) capable of separating the enzyme reaction product from the substrate, reactants or contaminating molecules. The presence of the enzyme reaction product can be detected in a variety of manners, including fluorescence emissions, absorbance of light or mass spectrometry (395). Other technologies, such as the RapidFire® technology from Biocius Life Sciences, can also be used to allow high throughput measurement of reaction products, analyzing thousands of reactions per day (see, e.g., U.S. Pat. Nos. 6,932,939, 6,812,030, 7,100,460, and 7,588,725).

Example 4: Combinatorial Multi-Enzyme Pathway Development Using On-Surface Expression

FIGS. 4A-4D are schematic drawings of multi-gene pathway development using synthetic genes and combinatorial libraries of said genes.

On-surface gene synthesis as described herein can be used to generate a large number of different constructs at low cost. Coupled with on-surface protein expression as described herein, the present invention provides an enzymatic pathway development platform using highly complex combinatorial libraries. As shown in FIGS. 4A-4D, a microarray (400) contains the oligonucleotides (410) needed to build a set of genes (420) expressing a set of proteins that make up an enzymatic pathway of interest (e.g., biosynthetic pathway). Each of these genes (420) can be modified in a variety of ways, giving diversity to the combinatorial library. For instance, the genes can include mutations, be placed under control of different regulatory sequences, and/or be expressed as fusion proteins having selectable markers (e.g., fluorescent marker, binding partners, affinity tags, chromatography tags, epitopes, HA-tag, etc.).

In one example, each gene can be placed under the control of two different promoters (430) giving high (pH) or low (pL) levels of relative expression. Resultant linear nucleic acid constructs (440) are mixed together in a well plate or on a feature of an array (FIG. 4B, 450) to give all the potential combinations (460) in a plurality of wells or features (e.g., M1 to M8). These mixtures of linear constructs can be immobilized (FIG. 4C, 465) on the surface of a solid support (467). As described above, a transcription/translation reaction mixture can be dispensed onto the individually arrayed combinations of genes M1-M8 (468). The solid support can be incubated within a humidity controlled chamber (470) to minimize evaporation at the appropriate temperature, as described herein. Within each of the reaction volumes (472), transcription and translation reactions can result in the production of the protein combinations (475) encoded by the arrayed gene combinations, representing, e.g., a metabolic pathway of interest or a biosynthetic pathway of interest. It should be appreciated that a plurality of gene combinations representing a plurality of metabolic pathways of interest can be produced in a highly parallel manner in accordance with the present invention.
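
The number of mixtures follows from simple combinatorics: with two promoter choices per gene, n genes give 2ⁿ combinations, so eight mixtures (M1 to M8) are consistent with a three-gene pathway. The Python sketch below enumerates the combinations under that assumption. The gene names and the assignment of particular combinations to the labels M1 to M8 are hypothetical.

```python
from itertools import product

genes = ["geneA", "geneB", "geneC"]  # hypothetical three-gene pathway
promoters = ["pH", "pL"]             # high and low relative expression

# One promoter choice per gene, enumerated over every possible assignment.
combinations = [
    tuple(f"{promoter}-{gene}" for promoter, gene in zip(choice, genes))
    for choice in product(promoters, repeat=len(genes))
]

# Label the mixtures M1..M8; the ordering is arbitrary.
mixtures = {f"M{i}": combo for i, combo in enumerate(combinations, start=1)}
```

Adding a fourth gene, or a third promoter strength, would raise the count to 16 or 27 mixtures respectively, which is where parallel on-surface expression becomes useful.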

Once the transcription/translation reaction is complete, the solid support (467) can be removed from the humidity chamber (470) and a droplet based dispensing apparatus (480) can be used to dispense reagents (490) containing appropriate substrates (492) and other reaction components such as buffers, salts and cofactors required for activity assay of the protein combinations (475) within the reaction. The protein combinations (475), if present at appropriate combination and/or concentration, can convert the substrates (492) to product (495) under appropriate reaction conditions. The product (495) can then be analyzed, for instance, using column chromatography and mass spectrometry, as described above. Thus, activities of the mixtures M1 to M8 can be compared side by side, to identify the gene/protein combination that is most effective at converting substrate to product.
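
One simple way to compare mixtures is to rank them by fractional conversion, i.e., product signal divided by the sum of product and remaining substrate signals. The Python sketch below assumes that substrate and product give comparable detector responses. The signal values are made-up placeholders used only to exercise the code, not measurements, and the function name is hypothetical.

```python
from typing import Dict, Tuple


def conversion(substrate_signal: float, product_signal: float) -> float:
    """Fraction of substrate converted, assuming equal detector response."""
    total = substrate_signal + product_signal
    if total <= 0:
        raise ValueError("no signal detected")
    return product_signal / total


# Placeholder (substrate, product) signals for three mixtures; not real data.
readings: Dict[str, Tuple[float, float]] = {
    "M1": (20.0, 80.0),
    "M2": (50.0, 50.0),
    "M3": (90.0, 10.0),
}

# Most effective mixture first.
ranked = sorted(readings, key=lambda m: conversion(*readings[m]), reverse=True)
```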

EQUIVALENTS

The present invention provides, among other things, novel methods and devices for protein arrays. While specific embodiments of the subject invention have been discussed, the above specification is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The full scope of the invention should be determined by reference to the claims, along with their full scope of equivalents, and the specification, along with such variations.

INCORPORATION BY REFERENCE

Reference is made to PCT application PCT/US09/55267; U.S. provisional application 61/257,591 filed Nov. 3, 2009; U.S. provisional application 61/264,643 filed Nov. 25, 2009; U.S. provisional application 61/264,632 filed Nov. 25, 2009; U.S. provisional application 61/264,641 filed Nov. 25, 2009; U.S. provisional application 61/293,192 filed Jan. 7, 2010; U.S. provisional application 61/310,100 filed Mar. 3, 2010; and U.S. provisional application 61/310,100 filed Mar. 3, 2010. All publications, patents and patent applications and sequence database entries mentioned herein are hereby incorporated by reference in their entirety as if each individual publication, patent or patent application is specifically and individually indicated to be incorporated by reference.

What is claimed is:
 1. A method for preparing a protein array having a plurality of proteins, the method comprising: (a) providing a plurality of nucleic acids each having a predefined sequence; and (b) expressing in vitro a plurality of proteins from the plurality of nucleic acids, wherein the plurality of proteins are expressed on an array.
 2. The method of claim 1 further comprising (c) measuring an activity of each of the plurality of proteins.
 3. The method of claim 1, wherein the plurality of nucleic acids are synthesized on a solid surface.
 4. The method of claim 1, wherein each of the plurality of nucleic acids comprises a regulatory genetic sequence.
 5. The method of claim 1, wherein each of the plurality of proteins is expressed in vitro in a well of a micro-well plate.
 6. The method of claim 1, wherein each of the plurality of proteins is expressed in vitro at a different feature of a solid surface.
 7. A method for preparing a protein array having a plurality of proteins, the method comprising: (a) providing a first microvolume comprising a population of nucleic acids having a plurality of distinct, predefined sequences; (b) immobilizing the nucleic acid sequences onto an array comprising a plurality of anchor oligonucleotides having a sequence complementary to a terminus sequence of the nucleic acids; (c) expressing in vitro in a second microvolume a plurality of proteins from the population of nucleic acids.
 8. A method for producing at least one protein, the method comprising: (a) providing a support having a plurality of distinct features, each feature comprising a plurality of immobilized anchor oligonucleotides; (b) generating at least one plurality of nucleic acid having a predefined sequence onto the plurality of anchor oligonucleotides; (c) providing a microvolume onto at least one feature of the support; and (d) expressing in vitro in the microvolume the at least one protein from the at least one nucleic acid.
 9. The method of claim 8 wherein each feature of the support comprises a distinct plurality of support-bound anchor oligonucleotides, wherein the 5′ end of each of the plurality of anchor oligonucleotide is complementary to the 5′ end of a distinct nucleic acid having a predefined sequence.
 10. The method of claim 8 wherein the plurality of nucleic acids are generated by assembling a plurality of construction oligonucleotides comprising partially overlapping sequences that define the sequence of the at least one nucleic acid.
 11. The method of claim 10 wherein the at least one nucleic acid is generated under (i) ligation conditions, (ii) chain extension conditions, or (iii) chain extension and ligation conditions.
 12. The method of claim 8 wherein the microvolume comprises reagents appropriate for expressing in vitro the at least one protein from the at least one nucleic acid.
 13. The method of claim 8 further comprising verifying the at least one nucleic acid sequence prior to the step of expressing the at least one protein.
 14. The method of claim 8 further comprising: synthesizing a plurality of partially overlapping construction oligonucleotides, wherein each construction oligonucleotide is synthesized at a distinct feature of the support comprising immobilized complementary construction oligonucleotides; releasing the construction oligonucleotides in at least one microvolume; and transferring the at least one microvolume to a distinct feature comprising a plurality of anchor oligonucleotides.
 15. The method of claim 8 wherein at least 1,000 proteins are produced.
 16. The method of claim 8 wherein the proteins are protein variants.
 17. The method of claim 8 further comprising screening the at least one protein to identify proteins having a desired characteristic.
 19. A protein array comprising: a solid surface having a plurality of anchor oligonucleotides capable of hybridizing with a plurality of nucleic acids; and a microvolume covering each of the plurality of anchor oligonucleotides, the microvolume configured to produce a polypeptide from each of the plurality of nucleic acids.
 20. A protein array comprising: (a) a first plurality of features on a support, each of the first plurality of features comprising a plurality of immobilized single stranded oligonucleotides, wherein the plurality of single stranded oligonucleotides comprises partially overlapping sequences that define the sequence of each of a plurality of nucleic acid molecules encoding a plurality of proteins; and (b) a second plurality of features, the second plurality of features comprising a plurality of anchor oligonucleotides having a sequence complementary to a terminus sequence of each of the plurality of nucleic acids.